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INTRODUCTION 


A  square  system  of  linear  equations,  Bx  ■  b,  is  said  to  be  sparse 
if  it  can  be  solved  most  efficiently  through  a  knowledge  of  the  arrange¬ 
ment  of  zero  and  nonzero  elements  in  the  matrix  B.  A  system  is  pro¬ 
portionally  sparse  if  the  percentage  of  nonzeroes  is  significantly 
small;  it  is  structurally  sparse  if  the  nonzero  elements  are  confined  to 
certain  parts  of  B  in  a  regular  way.  When  Bx  ■  b  is  sparse,  B  is 
also  called  a  sparse  matrix. 

The  method  of  Gaussian  elimination  computes,  essentially,  a 
factorization 


B  -  LU 

where  L  is  lower  triangular  and  U  is  upper  triangular.  The  system 
Bx  «  b  is  thus  reduced  to  a  pair  of  triangular  systems,  L(Ux)  ■  b,  which 
are  easily  solved  by  back-substitution.  Sparse  Gaussian  elimination 
denotes  any  adaptation  of  this  basic  method  to  take  advantage  of  sparsity 
in  B.  In  general,  techniques  of  sparse  elimination  rely  on  finding  an 
L  and  U  that  inherit  B's  proportional  or  structural  sparsity. 

A  linear  system  is  said  to  have  a  staircase  structure  if,  roughly, 


the  nonzero  elements  of  B  are  confined  to  certain  blocks  on  or  just 
below  the  diagonal: 


Staircase  systems  arise  in  models  that  have  a  natural  sequence  of  stages 
— for  example,  planning  or  control  models  that  consider  similar  activi¬ 
ties  at  successive  times.  Clearly  a  staircase  system  (and  the  staircase 
matrix  B)  may  be  both  proportionally  and  structurally  sparse. 

This  paper  proposes  methods  of  sparse  Gaussian  elimination  for 
staircase-structured  systems.  These  methods  are  particularly  applicable 
to  the  linear  programming  problem 

minimize  cx 

subject  to  Ax=b,  *  S.  x  —  u 

where  the  m  *  n  matrix  A  has  a  staircase  structure.  The  simplex  method 
for  linear  programming,  applied  to  such  a  problem,  must  solve  numerous 
m  x  m  staircase  systems.  These  systems  are  usually  quite  large — m>  500 
is  common — and  are  sometimes  especially  difficult  [4,22,24].  Ihe  methods 
of  this  paper  may  also  be  applied  to  linear  systems  that  arise  in  non¬ 
linear  optimization  or  control,  especially  where  the  staircase  blocks 
are  dense  and  irregular. 


2 


The  £irst  three  sections  of  this  paper  assess  the  current  state 


of  affairs.  Section  1  reviews  sparse  Gaussian  elimination  generally, 
and  Section  2  derives  various  properties  of  nonsingular  staircase  matrices. 
Section  3  describes  the  behavior  of  standard  sparse-eliminatlon  methods 
on  staircase  systems. 

The  remaining  sections  consider  special  methods  for  staircase 
systems.  Natural  pivot  orders  for  Gaussian  elimination  of  staircase 
systems  are  developed  in  Section  4.  Based  on  these  orders.  Sections  5 
and  6  set  forth  two  new  approaches  to  staircase  elimination,  and  report 
initial  computational  experience. 
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1.  SPARSE  GAUSSIAN  ELIMINATION 
Principles  of  elimination 

The  solution  of  Bx  -  b  does  not  depend  on  the  ordering  of  Its 
variables  or  equations.  More  precisely,  for  any  choice  of  permutation 
matrices  P  and  Q,  the  system  Bx  ■  b  is  equivalent  to 

(PBQT)(Qx)  *  Pb 


This  latter  system  may  be  solved  by  Gaussian  elimination  on  PBQ  ,  which 
is  just  B  with  its  rows  and  columns  rearranged. 

The  importance  of  choosing  P  and  Q  is  seen  by  considering 
the  computation  Involved  in  elimination.  Its  basic  calculations  are  de¬ 
fined  by  the  following  recursion: 


PBQ 


.(k+1)  (k)  <k)  (k).  (k) 

ij  eij  ‘  /ekk 


i,  j  >  k;  k  -  1 . m-1 


of  which  L  and  U  are  a  by-product: 


LU  eU 


(i).  <i) 

Uij  Bij  /Bii  * 


The  crucial  values  are  the  "pivots”  6 


i  >  j 

i  <  j 


00 

kk 


an  LU  factorization  exists 


if  and  only  if  all  pivots  are  nonzero.  Moreover,  elimination  is  numeri¬ 


cally  stable  only  if  all  pivots  are  sufficiently  large  in  magnitude, 

<k] 
kj 


(k)  (k) 

both  absolutely  and  relative  to  or  .  As  a  consequence, 


A 


practical  Gaussian  elimination  looks  for  permutations  P  and  Q  such 
T 

that  PBQ  has  an  acceptably  large  series  of  pivots. 

Sometimes  it  is  convenient  to  look  at  elimination  in  a  slightly 

different  way.  B  is  not  permuted;  but  the  kth  pivot  may  be  any  non- 
(k)  (k) 

zero  element  0  „  .  ...  of  B  .  The  recursion  functions  for  ellml- 

p(k) ,q(k) 

nation  then  take  on  the  following  general  form: 
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(1) 


B 


B 


(k+1) 


(k) 


p(D.q(j)  p(D,q(j) 


fi(k)  ft(k>  .  (k) 

p(i) ,q(k)  p(k),q(j)'  p(k),q(k) 


p(i),q(j)  >  k;  k  -  1,...,  m-1 


L  and  U  are  yielded  as  before: 


l  -  e(j) 
ij  P(i),q(j)  ’ 


p(i)  >  q ( j ) 


Uij  "  Bp(i),q(j)/Bp(i),q(i)  ,  P(1)  -q(j) 


Choosing  the  functions  p  and  q  is  called  pivot  selection.  It  is 

easily  seen  that  for  every  p,  q  there  are  permutation  matrices  P,  Q 

that  produce  the  same  L  and  U,  and  vice  versa:  p  is  just  the  row 

permutation  produced  by  P  \  and  q  the  column  permutation  produced 
-T 

by  Q  .  Thus  permuting  B  and  selecting  pivots  from  B  are  equivalent 
ways  of  looking  at  elimination,  and  hereafter  these  two  points  of  view 
will  be  used  interchangeably  as  convenience  dictates. 
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Practical  schemes  for  pivot  selection  are  of  two  general  types. 
Dynamic  selection  defers  choosing  the  kth  pivot  until  the  kth  step  of 
the  recursion.  Preselection  chooses  all  or  part  of  the  pivot  order  in 


advance  and  modifies  its  chosen  order  only  if  a  pivot  proves  unacceptable. 

(k) 

Dynamic  selection  is  the  more  powerful  approach,  as  it  may  examine  3 
to  choose  p(k)  and  q(k);  but  in  many  cases  preselection  can  yield 
acceptable  results  more  simply  and  cheaply. 

For  general  (non-sparse,  indefinite,  non-symmetric)  B,  the  sole 
aim  of  pivot  selection  is  to  insure  numerical  stability.  For  this  purpose 
the  preferred  scheme  is  a  compromise  between  dynamic  selection  and 
preselection.  The  row  pivot  order,  for  example,  is  chosen  in  advance; 


the  kth  pivot  column  is  then  chosen  dynamically  to  maximize  1 3 


(k) 


P(k) , j 

over  the  remaining  columns  J.  Alternatively ,  the  column  pivot  order 
is  chosen  in  advance  and  the  kth  pivot  row  maximizes  I ^ i^q(k.) I  over  i* 

(Full  dynamic  selection  might  maximize  |  over  all  remaining  i 

and  j.  It  would  be  more  costly,  however,  and  would  usually  gain  little 
in  stability.) 


Principles  of  sparse  elimination 

The  fundamental  observation  of  sparse  Gaussian  elimination  is 
the  following:  sparsity  of  the  factors  L  and  U  is  highly  sensitive 
to  the  choice  of  pivots  in  B.  A  well-known  example  is  the  matrix  whose 
nonzeroes  are  arranged  like  this: 


If  pivots  are  taken  in  order  down  the  diagonal,  L  and  U  retain  all  of 
the  sparsity  of  B: 


But  if  pivots  run  up  the  diagonal,  L  and  U  are  very  likely  100%  dense'. 
Experiments  with  less  structured  matrices  of  various  sorts  [11,13,14] 
have  shown  that  proportional  sparsity  of  L  and  U  can  also  vary  widely 
with  pivot  selection.  The  central  task  of  sparse  Gaussian  elimination, 
therefore,  is  choosing  pivots  to  preserve  sparsity  as  well  as  numerical 
stability. 

The  relation  of  pivot  selection  to  sparsity  is  best  explained  by 
referring  again  to  the  recursion  equation: 


e 


(k+l) 

ij 


_  g(k)  (k). 
6ik  Bkj  /B 


(k) 

kk 


If  6^  -  0,  then  e^+1)  -  0  only  if  -  0  or  g^  -  0;  other¬ 
wise,  "fills  in"  with  a  non-zero  element.  On  the  other  hand, 
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if  6 ^ ^  ^  0  then  also  ^  0  unless,  fortuitously,  (5^  = 

B^B.^/B,^;  in  the  latter  case  is  "cancelled"  back  to  zero, 

ik  kj  kk  ij 

In  practice  cancellation  is  fairly  rare,  and  so  it  is  legitimate  to  con¬ 
sider  only  the  "worst  case"  in  which  there  is  fill-in  but  no  cancellation, 
(k) 

The  submarices  B  then  become  progressively  denser.  And,  since 

and  all  nonzero  fill-in  is  eventually 

reflected  in  L  or  U.  In  the  best  possible  case,  there  is  no  fill-in 

and  L,  U  have  he  same  nonzeroes  as  the  upper  and  lower  triangles  of  B. 

(k) 

At  worst,  every  element  eventually  fills  in  for  some  k,  and  L, 

U  are  100%  dense.  (The  example  above  shows  both  best  and  worst,  for 
different  pivot  choices.) 

Preserving  the  sparsity  of  L  and  U  is  thus  essentially  a  job 
of  controlling  fill-in.  Other  methods  of  direct  solution — for  example, 
Gauss-Jordan  elimination  and  orthogonal  factorization  by  Givens  or 
Householder  transformations — also  suffer  from  fill-in.  Of  all  popular 
methods,  Gaussian  elimination  controls  fill-in  the  best,  and  hence  it 
is  preferred  for  solving  large  sparse  systems.  Under  suitable  assumptions, 
Gaussian  elimination  can  also  be  shown  to  require  minimal  storage  among 
all  algorithms  that  compute  L  and  U  [6]. 

Minimizing  fill-in  in  Gaussian  elimination  implies  a  number  of 
economies.  Certainly,  it  means  that  a  minimal  amount  of  storage  is  re¬ 
quired  to  hold  the  nonzeroes  of  L  and  U.  It  also  generally  means 
that  L  and  U  can  be  computed  faster,  since  only  additions  and  multi¬ 
plications  with  nonzeroes  must  be  carried  out.  Solution  of  L(Ux)  =  b 
by  back-substitution  is  likewise  more  efficient  when  L,  U  are  sparser. 
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(Minimizing  fill-in  is  not  equivalent  to  minimizing  additions  and  multi¬ 
plications,  and  there  may  be  no  pivot  order  that  minimizes  both.  However, 
for  sufficiently  large  and  sparse  B  the  difference  between  minimal 
fill-in  and  minimal  computation  is  negligible  [7].) 

When  B  has  some  s*  ucture,  it  may  also  be  worth  controlling 
fill-in  so  that  the  structur.  is  preserved  in  L  and  U.  Often  structure 
can  be  exploited  to  further  reduce  the  computation  involved  in  elimination 
and  in  back  substitution. 

It  remains  to  say  how  to  find  a  pivot  order  that  minimizes  fill-in. 
Unfortunately,  for  a  general  B  there  is  no  known  algorithm  that  minimizes 
fill-in  without  essentially  looking  at  all  m'.  possible  pivot  orders. 
Moreover,  the  minimal-fill- in  problem  has  been  shown  to  be  NP-complete 
[43],  so  there  is  little  hope  of  finding  an  efficient  algorithm  for  it. 

Consequently,  practical  methods  for  sparse  elimination  use 
heuristic  techniques  to  keep  fill-in  low,  if  not  necessarily  minimal. 
Successful  techniques  are  of  two  kinds.  Structural  techniques  first 
permute  B  to  reveal  some  special  structure,  then  preselect  a  pivot 
order  known  to  be  good  for  that  structure.  Techniques  of  local  minimiza¬ 
tion  (or  merit  techniques)  dynamically  select  each  pivot  to  minimize 

(k) 

fill-in  or  some  related  function  of  8  .  Both  of  these  approaches  are 

described  further  below,  after  which  the  two  are  briefly  compared. 

Regardless  of  technique,  some  compromise  with  numerical 
stability  must  be  made  in  order  to  promote  sparsity.  In  practice  a 
compromise  is  achieved  by  weakening  the  standard  for  an  acceptable  pivot 
element:  rather  than  being  the  element  of  largest  magnitude  in  its  row 
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or  column  of  6 


(k) 

'  ' ,  the  pivot  need  only  be  "not  too  small"  relative  to 

(k) 

the  element  of  largest  magnitude.  More  precisely,  8^  is  acceptable 
as  a  pivot  element  if  it  exceeds  some  absolute  tolerance  e  and  if 
either 

6kk}  2  .  for  a11  i  >  k 

or 

Bkk}  -  6eik}  *  for  a11  1  >  k 

-1  -4 

Values  of  6  from  10  to  10  have  been  found  to  afford  reasonable 
stability  and  sparsity  [13,15,54]. 


Structural  techniques 

Sparse  elimination  was  first  applied  to  particularly  strong  and 
regular  structures  arising  from  models  of  various  physical  systems.  In 
these  cases  a  "natural"  best  pivot  order  could  often  be  found  by  inspec¬ 
tion.  For  example,  a  1961  survey  by  Parter  [40]  analyzes  tridiagonal, 
five-diagonal  and  various  "looped"  systems,  as  well  as  systems  that 
arise  from  Laplace  and  biharmonic  equations.  More  recently  Varah  [55,56] 
and  Keller  [35]  have  investigated  pivot  orders  for  block-tridiagonal 
and  special  staircase  systems  that  arise  in  solution  of  the  two-point 
boundary-value  problem. 

Many  other  applications  yield  systems  that  have  no  such  strong 
structure.  Yet  these  systems  often  do  have  weaker,  more  general 
structure  that  can  be  exploited  to  choose  good  (if  not  best)  pivots 
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u 


I. 


and  to  save  storage  and  computation.  Structures  of  this  sort  are 
usually  not  evident  from  inspection;  they  must  be  found  through  syste¬ 
matic  analysis  of  the  matrix  B  by  one  or  more  specially-designed 
algorithms .  To  be  practical  these  algorithms  must  themselves  be  pro¬ 
grammed  for  the  computer,  so  that  they  may  be  run  just  prior  to  the 
elimination  algorithm. 

Such  algorithms  have  been  proposed  for  several  kinds  of 
structures.  For  example,  Alway  and  Martin  [1]  considered  looking  for 
a  banded  form  of  a  matrix;  Weil  and  Kettler  [58]  proposed  a  way  of 
permuting  matrices  to  block-angular  form. 

One  particular  general  structure — the  block-triangular  form — 
is  by  far  the  best  known  and  most  widely  applied.  To  define  block- 
triangularity  formally,  one  specifies  a  partition  (m^,...,m  ),  £^m^  =  m, 
of  both  the  rows  and  columns  of  B;  this  induces  a  partition  of  B  into 


m  blocks  B : 


B11 

B12 

b13 

B14 

B15 

B21 

B22 

B23 

B24 

B25 

B31 

B32 

B33 

B34 

B35 

B41 

B42 

B43 

B44 

B45 

B51 

B52 

B53 

B54 

B55 
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Partitioned  in  this  way,  B  is  (lower)  block-triangular  if  i  <  j 
implies  *  0: 


B11 

- 1 

1 

B22 

1 

i 

1 

B33 

1 

B44 

1 

B55 

When  B  is  nonsingular,  so  is  each  of  the  square  diagonal  blocks  B^ 
Given  a  block-triangular  partition  of  B,  there  is  a  natural 
good  way  to  choose  pivots:  first  pick  pivots  in  B^,  then  m2 

pivots  in  B22,  and  so  forth  through  B^.  It  is  easily  seen  that 
pivots  in  B^  will  then  cause  fill-in  within  blocks  B_^,  ®i+l  i* 
...,  Bfci  only,  and  that  the  total  fill-in  of  B  will  thereby  tend  to 
be  limited.  In  fact  elimination  of  B  has  no  effect  at  all  upon 
columns  in  subsequent  partitions,  so  that  there  is  a  reduction  in  com¬ 
putation  as  well.  Moreover,  the  resulting  factor  U  has  a  block- 
diagonal  form: 
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_ r.., 


.  tAulVv  -Mf’ 


that  makes  the  back-substitution  particularly  easy.  Better  yet,  it  may 
be  possible  to  choose  the  partition  so  that,  for  some  values  of  i, 

=  1.  Then  there  is  just  one  "triangle  column"  in  partition  i;  this 
column  appears  unchanged  in  L,  while  the  corresponding  column  of  U 
is  just  a  unit  vector. 

(There  is  also  a  block-form  of  the  elimination  algorithm  [52] 
that  just  factors  f°r  eac^*  diagonal  block.  This  may  be  even 

sparser  and  more  efficient  than  standard  elimination  [17,23],  but  it 
does  require  that  B  be  available  during  back-substitution.) 

In  light  of  the  above  observations,  it  may  pay  to  look  for  a 
"best"  block-triangular  permutation  of  B.  Certainly  a  best  permutation 
should  be  "irreducible":  no  diagonal  blcok  B^  should  itself  be 
permutable  to  block-triangular  form  with  more  than  one  block.  A  best 
permutation  should  also  have  as  many  diagonal  blocks  as  possible. 
Fortunately,  B  always  has  a  permutation  best  in  both  of  these  senses, 
and  essentially  unique:  if  any  other  permutation  is  just  as  good,  it 
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must  have  the  same  diagonal  blocks  (possibly  in  a  different  order),  and 
the  same  rows  and  columns  in  each  block.  Thus  it  is  proper  to  speak  of 
the  block-triangular  reduction  of  B.  (Existence  of  this  reduction  was 
first  shown  by  Dulmage  and  Mendelsohn  [18,19],  as  a  by-product  of  their 
study  of  structures  of  bipartite  graphs,  and  it  has  since  been  demonstrated 
in  other  ways.) 

The  block-triangular  form  of  any  nonsingular  B  may  be  found 
systematically  in  two  steps: 

(1)  Find  a  permutation  P  so  that  BP  has  no  zeroes  on  its 
diagonal.  This  step  is  called  finding  a  transversal  of 
B,  and  it  can  be  accomplished  for  any  nonsingular  B. 

T 

(2)  Find  a  permutation  Q  so  that  Q  (BP)Q  is  block- triangular 
with  as  many  diagonal  blocks  as  possible.  This  is  equivalent 
to  the  problem  of  finding  and  ordering  the  strong  components 
of  the  graph  of  BP  (or  the  irreducible  classes  of  the  Markov 
chain  whose  transitions  are  defined  by  the  nonzeroes  of  BP). 

T 

Dulmage  and  Mendelsohn  [20,21]  showed  that  the  resulting  matrix,  Q  B(PQ) , 
must  be  the  block-triangular  reduction  of  B.  Step  (1)  requires  only 
a  column  permutation,  and  step  (2)  only  a  symmetric  permutation,  and 
both  are  inherently  simpler  than  finding  a  general  permutation  of  the 
rows  and  columns  of  B.  Moreover,  both  steps  involve  well-known  problems 
for  which  there  are  efficient  algorithms.  Duff  [12]  reviews  transversal 
algorithms;  Duff  and  Reid  [16]  and  Gustavson  [26]  describe  implementations 
of  the  method  of  Tarjan  [49]  for  finding  strong  components. 

The  computational  effort  involved  in  each  algorithm  has  been 
obeserved  to  be  a  roughly  linear  function  of  the  number  of  rows  of  B 
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and  of  its  number  of  nonzero  elements.  Since  the  comparable  effort  in 
Gaussian  elimination  is  at  best  a  quadratic  function,  block-triangular 
reduction  is  relatively  cheap  for  large  systems  [13,17]. 

There  remains  the  problem  of  choosing  pivots  within  each  diagonal 
block  B^,  whenever  >  1.  This,  too,  is  amenable  to  structural 
techniques.  A  number  of  ideas  have  been  put  forth  [34,48,53];  the  only 
approach  that  is  widely  used,  however,  is  due  to  Hellerman  and  Rarick 
[27,28].  It  is  based  on  the  observation  that,  even  within  a  block  B^, 
any  triangle  column — one  that  has  no  elements  above  the  diagonal — appears 
unchanged  in  L  and  cannot  fill  in.  All  fill-in  is  confined  to  "spike" 
columns  that  have  elements  above  the  diagonal.  The  idea,  therefore,  is 
to  permute  each  so  that  most  columns  are  triangle  columns,  while  a 

relative  few  are  spikes.  B  then  has  a  "bump  and  spike"  structure  such 
as  this : 


and,  as  before,  B's  structure  is  inherited  by  the  factor  U: 


Only  the  spike  columns  in  U  need  be  stored,  since  all  others  are 
unit  vectors. 

No  effective  algorithm  is  known  for  finding  a  best  spike  structure 
in  any  useful  sense.  It  is  therefore  necessary  to  rely  on  heuristics  that 
seem  to  find  a  good  structure.  The  standard  for  this  purpose  is  Hellerman 
and  Rarick's  Preassigned  Pivot  Procedure  (P3) .  As  implemented  in  various 
linear  programming  codes,  it  is  regularly  observed  to  reduce  the  pro¬ 
portion  of  spikes  in  a  bump  to  10-20%. 

P3  does  not  have  the  inherent  efficiency  of  the  algorithms  for 
transversal  or  strong  components.  However,  P3  is  applied  only  to  the 
blocks  B^,  not  to  the  whole  matrix  B.  Hence  P3's  efficiency  is  accept¬ 
able  provided  that  B's  block-triangular  reduction  has  no  very  large 
diagonal  blocks.  This  is  -ften — but  not  always — the  case  in  practice, 
as  the  staircase  examples  of  Sections  3  and  5  will  show. 

Special  cart  must  be  taken  to  maintain  the  bump  and  spike  structure 
in  the  face  of  numerical  instability.  Any  rearrangement  of  the  rows  will 
tend  to  destroy  the  structure;  consequently  a  too-small  pivot  must  be 
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avoided  only  through  changes  in  the  column  order.  Practical  implementa¬ 
tions  employ  the  following  strategy:  if  the  kth  pivot  turns  out  to  be 

unacceptably  small,  exchange  column  k  with  a  later  column  s  such  that 
(k) 

>  e.  To  promote  stability,  s  is  usually  also  chosen  so  that 

B^k)  >  6B^k)  for  j  >  k. 
ks  —  kj  J 


(k) 


It  is  easily  shown  that  "0  if  j  is  a  triangle  column 

or  if  j  and  k  lie  in  different  bumps.  Thus  column  k  may  only  be 
exchanged  with  a  spike  column  s  in  the  same  bump.  Such  an  exchange  is 
called  a  "triangle  swap"  or  a  "spike  swap"  according  to  whether  k  is 
a  triangle  or  spike  column.  A  triangle  swap  may  increase  the  number  of 
spikes  by  one,  whereas  a  spike  swap  leaves  the  number  of  spikes  unchanged. 
(Triangle  swaps  may  be  safely  avoided,  however,  provided  that  the  ratio 
of  largest  to  smallest  element  in  any  column  of  the  LP  does  not  exceed 
1/6.) 


When  the  pivot  order  is  determined  by  P3,  it  sometimes  happens 
that  a  spike's  pivot  element  is  identically  zero.  This  is  generally  not 
a  sign  of  numerical  instability,  but  is  due  rather  to  an  unfortunate 
choice  of  spike  order.  A  suitable  spike  swap  will  remedy  the  situation. 


Techniques  of  local  minimization 

Markowitz  [37]  first  suggested  that  a  sparse  pivot  order  might 

be  found  by  minimizing  some  criterion  at  each  pivot.  Methods  of  this 

sort  consider,  at  the  kth  elimination  step,  all  nonzero  elements  of 
(k) 

6  that  are  large  enough  to  serve  as  pivots;  they  associate  with 
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Tf 


each  such  element  an  appropriate  "merit  function,"  and  take  as  the  kth 
pivot  an  element  that  minimizes  this  function.  Thus  each  pivot  Is 
"locally"  optimal  In  some  sense,  and  It  can  be  hoped  that  the  resulting 
pivot  order  is  close  to  globally  optimal. 

Two  obvious  candidates  for  merit  function  are  total  fill-ln  due 


to  the  kth  pivot,  and  total  arithmetic  operations  in  the  kth  step  of 
the  elimination.  It  was  thought  at  first  that  minimizing  either  of  these 
functions  locally  might  necessarily  minimize  it  globally,  but  simple 


counterexamples  were  demonstrated  by  Douglas  [10]  and  Bertele  and 
Brioschi  [5]. 


Computing  total  fill-in  or  total  arithmetic  for  a  pivot  in  0 


(k) 


is  a  fairly  time-consuming  procedure.  Consequently,  there  have  been 

(k) 

proposals  for  simpler  merit  functions  that  rely  only  on  r^  ,  the  number 

n  r< 

(k) 


(k)  (k) 

of  nonzeroes  in  row  i  of  B  ,  and  c^  ,  the  number  of  nonzeroes  in 


colunn  j  of  6 
follows : 


Two  typical  functions  of  this  sort  are  defined  as 


•  Minimize  (r^^  _  l)(cj^^  -  1)  over  all  potential  pivots  0 ^ ^ . 
This  function,  the  one  proposed  originally  by  Markowitz, 

(k) 

is  a  crude  upper  bound  on  the  fill-in  due  to  a  pivot  on  0 ^  . 

It  is  also  approximately  the  number  of  multiplications  and 
divisions  required  by  the  pivot. 


•  Pick  a  pivot  colunn  j*  that  minimizes  c^ 

(k)  j 


(k) 


then  pick  a 


pivot  row  i*  that  minimizes  r' 


8 


(k) 

ij*' 


minimize  c 


Or,  first  pick  i* 
(k) 


to  minimize 


over  all  potential  pivots 

(k) 


r ,  then 
1  (k) 


j’ 


to 


j 


over  all  potential  pivots  s!*, 

i  j 
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Experiments  with  these  and  similar  criteria  have  been  reported 
by  Tewarson  [SO],  Dantzig  et  al.  [9],  Curtis  and  Reid  [8],  and  Duff  and 
Reid  [14].  The  total-fill-in  and  total-arithmetic  merit  functions  have 
generally  performed  better  than  the  simpler  functions,  but  only  moderately 
so ;  no  function  has  been  consistently  the  best.  Markowitz'  original 
merit  function  offers  perhaps  the  best  combination  of  effectiveness  and 
simplicity  of  those  studied. 

In  a  few  experiments  local  minimization  has  produced  less  fill-in 

than  the  bump-and-spike  structural  techniques  [23];  however,  there  are 

no  definitive  comparisons.  Certainly  local  minimization  has  the  advantage 

(k) 

of  being  able  to  take  any  nonzero  of  8  as  a  pivot,  without  regard 

to  preserving  some  structure;  on  the  other  hand,  it  cannot  use  structure 

to  advantage.  Its  greatest  drawback  is  its  need  to  know  where  all  the 
(k) 

nonzeroes  are  in  8  ,  while  efficient  structural  techniques  [22]  only 

(k)  (k) 

explicitly  compute  column  k  of  8  •  For  local  minimization  8  may 

(k) 

share  storage  with  L  and  U — since  8  shrinks  as  L  and  U  are 

determined — but  such  an  arrangement  requires  a  "cushion"  of  unused 

storage  space,  plus  special  storage-management  routines  [17]. 

(k) 

If  the  sparsity  pattern  of  8  is  not  available,  it  is  still 


possible  to  do  a  sort  of  local  minimization  based  on  r^ 
original  numbers  of  nonzeroes  in  the  rows  and  columns  of 


and  Cj ,  the 
B.  Typically 


the  columns  are  ordered  in  advance  of  the  elimination,  according  to  some 


merit  function;  as  the  elimination  progresses,  a  pivot  row  is  chosen  for 
each  column  according  to  a  second  function.  (Of  course,  the  roles  of 
rows  and  columns  may  be  interchanged.)  Duff  and  Reid  [14]  found  that 


1° 


1 


taking  Just  as  the  column  merit  function,  and  and  the  row 

merit  function,  yielded  results  about  as  good  as  other  more  complex 

functions  proposed  by  Tewarson  [51]  and  others.  However,  they  also 

(k) 

determined  that  substituting  r^  for  r^  gave  generally  better  results, 

(k) 


(k)  (k) 

and  that  full  use  of  r^  and  c^  worked  best  of  all. 


Another  way  to  avoid  carrying  the  full  sparsity  pattern  of  0 
is  to  combine  local  minimization  with  a  structural  technique.  For 
example,  B  may  first  be  permuted  to  block-triangular  form,  then  local 
minimization  may  be  applied  to  choose  pivots  in  each  diagonal  block.  Gay 
[23]  and  Duff  and  Reid  [17]  report  experiments  with  this  hybrid  technique 
Other  hybrids  are  appropriate  to  staircase  elimination  as  shown  in 
Section  6  of  this  paper. 


Considerations  in  linear  programming 

Computer  codes  for  linear  programming  use  Gaussian  elimination 
to  compute  an  LU  factorization  of  the  basis  matrix.  Since  the  basis  changes 
by  just  one  column  at  each  iteration,  it  is  essential  to  be  able  to  update 
L  and  U  at  most  iterations;  a  fresh  elimination  is  carried  out 
only  occasionally  (typically  every  50-100  iterations)  when  the  updated 
representations  of  L  and  U  become  too  large  or  inaccurate. 

Methods  for  updating  L  and  U  are  based  on  the  ideas  of  Bartels 
and  Golub  [2,3],  but  differ  considerably  in  details.  As  a  result,  they 
favor  quite  different  approaches  to  sparse  elimination. 
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Saunders' updating  scheme  [45,46]  is  designed  to  allow  L  and 
most  of  U  to  be  stored  sequentially.  It  is  predicated  upon  a  favorable 
bump  and  spike  structure,  and  it  exploits  the  fact  that  triangle  columns 
of  B  give  rise  to  trivial  unit  vectors  in  U.  Consequently,  it  favors 
the  structural  techniques  of  sparse  elimination. 

Reid's  updating  scheme  [42]  maintains  sparser  factors  at  the  cost 
of  random  access  to  U.  It  works  best  when  the  bulk  of  nonzeroes  is  in 
U,  and  so  prefers  a  local-minimization  technique  adjusted  to  favor  fill- 
in  within  U.  It  is  not  so  suitable  to  the  usual  bump- and- spike  tech¬ 
nique,  which  puts  most  non-zeroes  in  L;  but  it  might  profitably  use  an 

upper  block-triangular  structure  that  arises  from  applying  the  same 
T 

techniques  to  B  . 

These  observations  suggest  that  no  technique  of  sparse  elimination 
can  be  considered  best  on  the  bases  of  fill-in  and  computational  cost 
alone.  It  will  probably  always  be  necessary  to  consider  different  tech¬ 
niques  for  different  situations. 
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2.  STAIRCASE  SYSTEMS 


Staircase  matrices 

A  "staircase"  matrix  is  one  whose  nonzeroes  are  confined  to 
certain  submatrices  centered  roughly  on  and  just  off  the  diagonal — 
as,  for  example: 


B11 

~1 

B21 

B22 

B32 

B33 

B43 

B44 

B54 

B55 

_ 

B65 

B66 

More  precisely,  one  partitions  the  rows  of  an  m  x  m  matrix  B  into  t 

disjoint  subsets,  and  the  columns  into  t  (generally  different)  disjoint 

2 

subsets,  so  that  the  matrix  is  partitioned  into  t  submatrices,  or 
"blocks": 


B 

i 

i 


B 


ij’ 


i  = 


is  lower  staircase  (as  above)  if 
*  j+1.  B  is  upper  staircase  If 

-  j-l. 


1,*».,  tj  j  =  1 ,  •  •  • , 

B^j  =  0  except  for 
B^  =  0  except  for 


t 

1  *  j 
i  -  j 


and 

and 
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Any  lower-staircase  matrix  may  be  permuted  to  upper-staircase 
form,  by  reversing  che  order  of  the  partitions: 


- 1 

B66  B65 

!  855 

B54 

B44  B43 

B33 

B32 

B22 

B21 

1 _ 

B11 

Likewise,  any  upper  staircase  may  be  permuted  to  lower-staircase  form. 
The  two  forms  are  thus  equivalent,  and  either  may  be  considered  without 
]oss  of  generality.  This  paper  adopts  the  lower-staircase  form,  and 
henceforth  a  lower  staircase  matrix  will  be  called  just  a  "staircase 
matrix" . 

If  a  row  of  the  ith  partition  is  entirely  zero  within  B^, 

that  row  may  be  moved  back  to  the  (i-l)st  partition  without  loss  of 

the  staircase  structure.  Similarly,  a  colutiu.  of  the  jth  subset  that 

is  all-zero  within  B  may  be  moved  to  the  (j+l)st  partition.  A 

nonsingular  staircase  B  may  thus  be  permuted  so  that  its  diagonal 

blocks  B,  ,  have  no  all-zero  rows  or  columns.  B  is  then  said  to  be 
kk 

in  standard  staircase  form. 
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The  off-diagonal  blocks  B^^  ^  may  have  all-zero  rows  or 

columns  whether  B  is  in  standard  form  or  not.  It  is  revealing  to 

further  permute  the  rows  of  the  ith  partition  so  that  the  nonzero  rows 

of  B  come  first,  and  to  permute  the  columns  of  the  jth  partition 

i,  i— 1 

so  that  the  nonzero  columns  of  B  .  .  come  last.  Then  B  has  the 

3+i»3 

following  reduced  form: 


B,  .  ,  is  iust  the  submatrix  formed  from  the  nonzero  rows  and  colunns 
k+1 , k  J 

of  B,  ,  ,  .  The  rows  and  columns  of  B  that  have  nonzeroes  in  B,  , 
k+l,k  k+1 ,k 

are  the  linking  rows  and  columns  between  partitions  k  and  k+1. 

If  the  linking  rows  of  every  partition  k  are  moved  back  to 
partition  k-1,  then  B  gains  an  alternative  row-upper-staircase  form: 
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Moving  the  linking  columns  of  partition  k  to  partition  k+1  gives  a 
different,  column-upper- stair case  form.  Thus  any  one  permutation  of  a 
staircase  matrix  in  reduced  standard  form  embodies  three  staircases: 
lower,  row-upper,  and  column-upper.  Each  corresponds  to  a  different 
choice  of  partitions. 

The  staircase  partitioning  of  B  is  in  no  way  unique.  Even  con¬ 
fined  to  standard  form,  the  choice  is  considerable;  for  example,  either 
the  row-upper  or  column-upper  variant  may  be  permuted  to  a  lower  stair¬ 
case  and  put  in  its  own  standard  form,  which  is  generally  different  from 
the  original.  It  is  to  be  expected  that  staircase  matrices  arising  In 
applications  will  have  certain  natural  standard-form  partitions  that  are 
best  to  work  with,  but  there  is  no  obvious  way  to  choose  the  "best" 
staircase  partition  in  general. 
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There  may  be  other  candidates  for  a  "standard"  staircase  form  as 


well.  This  paper's  definition  of  reduced  standard  form  was  chosen  for 
two  reasons.  First,  it  seems  to  correspond  to  an  intuitive  way  of  think¬ 
ing  about  staircase  systems:  the  diagonal  blocks  are  full,  separate 
sets  of  equations,  linked  together  by  the  smaller  off-diagonal  blocks. 
Second,  a  single  permutation  of  B  suffices  to  display  the  reduced 
standard  form  and  its  upper-staircase  variants.  Methods  of  sparse 
Gaussian  elimination  also  seek  a  permutation  of  B,  and  this  staircase 
permutation  offers  a  good  starting  point  as  subsequent  sections  will 
show. 


The  number  of  rows  in  partition  i  will  be  denoted  m^,  and 

the  number  of  columns  in  partition  j  will  be  n_. ;  analogously,  the 

numbers  of  linking  rows  and  columns  will  be  nu  and  n^ .  The  number  of 

rows  in  partition  i  of  the  row-upper  form  will  be  m*,  and  the  number 

of  columns  in  partition  j  of  the  column-upper  form  will  be  n^ . 

Necessarily,  £m  =  £  m*  =  £  n .  =  £  n^  =  m,  and  m,  <_  m. ,  n  <_  n. .  By 
1  J  1  1  J  J 

definition. 


1 

1 

* 

m 

“  n^2  > 

n 

=  ni  -  ni 

9 

i 

m 

=  mt  +  »1+1  - 

mi 

,  nj 

=  n  -  n. 
j  J 

+  "j-i  ■ 

i  >  j  =  2 . t-1 

t 

- 

t 

- 

m 

=  ~  , 

n 

=  n  +  n 

t  t 

t  t 

-1  ' 

\ 

X  nk’  Bk+l,k 

is 

\+i  x 

n^,  and 

Bk+l,k  ls 

“k+l  X  V 
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Staircase  systems 

Staircase  matrices  B  arise  from  staircase  systems  Bx  *  b. 

In  the  majority  of  practical  staircase  systems,  the  staircase  partitions 
represent  successive  periods  of  time.  Hence  "period"  means  the  same  as 
"partition"  in  describing  staircase  structures. 

Frequently  the  source  of  a  staircase  system  is  a  "multi-period" 
or  "time-staged"  linear  model  of  a  physical  or  economic  process.  For 
example,  an  economic  planning  model  might  minimize  costs  over  all  periods, 
subject  to  constraints  governing  each  period  and  its  relation  to  the  next 
period.  If  the  minimum  is  found  by  some  variant  of  the  simplex  method, 
there  may  well  be  in  excess  of  2m  iterations,  each  involving  solution 
of  a  staircase  system  with  a  different  B. 

Staircase  systems  are  also  naturally  associated  with  linear  con¬ 
trol.  Indeed,  a  simple  discrete  dynamic  model  with  state  vectors  x^ 


i+1 


=  AiXi  + 


V 


i  =  1... 


t 


is  readily  seen  to  be  a  staircase  system.  General  models  of  optimal 
control  give  rise  to  more  complicated  systems.  Typically  a  control  model 
minimizes  a  function  of  state  vectors  x^  and  control  vectors  u^,  sub¬ 
ject  to  dynamic  equations, 


r  v  A(D  4.  n(1>  4.  u(D 

Vi+l  =  Ai  xi  +  Di  Ui  +  bi 


1... 
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and  control  constraints 


0 


D<2>  +  b 


(2) 


i  -  1,...,  t+1 


Together,  these  equations  form  a  non-square  staircase  system;  again, 
if  the  minimization  is  carried  out  by  some  simplex  method,  there  are 
square  staircase  systems  to  be  solved  at  each  iteration.  When  the 
resulting  staircase  matrices  are  put  in  reduced  form,  the  linking  rows 
correspond  to  dynamic  equations,  and  the  non-linking  rows  to  control 
constraints;  the  linking  columns  correspond  to  state  variables,  and 
the  non-linking  columns  to  control  variables. 


Higher-order  staircases 

A  more  general  approach  says  that  a  staircase  matrix  is  of  order 
if  =  0  except  for  i  =  j,  j+1,  ...,  j+r.  The  preceding  subsections 

thus  characterized  matrices,  and  systems,  of  order  one.  Higher-order 
staircase  systems  are  not  uncommon  in  complex  applications  (for  example, 
modeling  energy  systems  [39]).  They  arise  also  from  control  models 
that  have  rth-order  dynamic  equations. 

This  paper  is  predominantly  concerned  with  first-order  staircase 
systems:  these  have  the  most  specialized  structure  and,  consequently, 
are  most  amenable  to  special  techniques.  Still,  some  techniques  are 
applicable  to  higher-order  staircases  as  well,  with  appropriate  modifica¬ 
tions  that  will  be  pointed  out  as  the  exposition  proceeds.  For  brevity, 
however,  the  adjective  "first-order"  will  usually  be  dropped. 


Higher  order  systems  can  also  be  made  into  first-order  ones,  in 
either  of  two  ways.  First,  an  rth-order  system  can  be  transformed  to 
a  first-order  one  by  adding  certain  variables  and  equations,  yielding  a 
larger  system  that  has  the  same  number  of  periods.  Second,  every  r 
periods  of  an  rth-order  system  may  simply  be  aggregated  as  one  period, 
resulting  in  a  first-order  system  of  the  same  size  having  only  about 
t/r  periods.  The  first  method  is  most  practical  when  the  system  has 
only  a  few  nonzeroes  below  the  staircase  to  begin  with,  while  the  second 
may  be  feasible  when  the  number  of  periods  is  large  relative  to  r. 


"Balanced"  staircases 

As  defined,  each  period  of  a  staircase  matrix  may  have  any  number 
of  rows  and  columns,  subject  to  £  m^  *  £ n^  *  m;  and,  of  these,  any 
number  may  be  linking  rows  and  columns,  subject  to  m^  and  n^  n j  • 

But  if  Bx  =  b  is  to  have  a  solution  for  arbitrary  b,  then  B  must  be 
a  nonsingular  staircase  matrix.  Nonsingularity  imposes  much  stronger 
constraints  upon  the  shape  of  the  staircase.  Essentially,  a  nonsingular 
staircase  matrix  must  be  well  "balanced":  there  cannot  be  too  great  a 
difference  between  the  numbers  of  rows  and  columns,  either  in  individual 
periods  or  cumulatively. 

The  simplest  balance  relations  are  derived  as  follows.  Given 
that  B  is  nonsingular,  the  colutms  in  any  period  k  must  be  independent 
But  the  n^  columns  of  period  k  have  nonzero  elements  only  in  blocks 
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and  Bk+1  k,  and  so  they  can  be  Independent  only  if  n^  • 

special  consideration  of  the  last  period,  this  shows  that 


nk  i  \  +  "Wl 


k  •=  1 .  t-1 


The  rows  of  a  nonsingular  matrix  must  also  be  independent.  Hence,  by 
similar  argument. 


m^  <_  n^ 

\  ^  Vi  +  nk  •  k 

Putting  these  relations  together,  one  gets 


0  <_  n^  -  <_ 


-nk-i  ^nk  -  \  <Vi  ’  k  ’  2 .  t_1 


”nt-l  <  nt  -  rat  < 0 


The  difference  n^  -  m^  is  the  imbalance  between  rows  and  columns  of 
period  k.  Taken  together,  these  inequalities  tend  to  rule  out  a  very 
large  imbalance  in  any  period. 

A  stronger  set  of  inequalities  can  be  derived  by  also  consider¬ 
ing  the  numbers  of  linking  rows  and  columns  in  the  reduced  form  of  B. 
For  example,  the  colunns  of  period  k  really  have  nonzeroes  only  in  the 
m^  +  mjc+j  rows  of  blocks  and  Bk+1  k'  ^*us 


With 
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nk  -  \  +  “k+l  * 


nt^ 


in 


k  *  1, . . . ,  t-1 

i 


Similar  reasoning  applies  to  the  rows  of  each  period  to  give 
£  n^ 

“k  -  nk-l  +  nk*  k  =  2 .  t 


Further,  within  each  period  the  n^  -  n^  non-linking  columns  must  be 
independent.  These  colunns  have  nonzeroes  only  within  Bj.  ,  and  so 


nk  “  nk  -  “k  ’ 


k  =  1 , . . . ,  t-1 


Analogously,  the  non-linking  rows  in  period  k  must  be  independent, 

so  that 


2,. 


t 


Combining  these  four  sets  of  relationships. 


0  <_  n^  -  <_  mining, n^) 

A  ^  A  A 

-min(m^,n^_^)  <_  n^  -  m^  £  min n^)  >  k  *  2 , . . . ,  t-1 

-min(rat>nt  £  n^  -  m£  <  0 
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In  brief,  the  imbalance  of  period  k  is  bounded  by  the  smaller  dimension 


of  B,  .  and  the  smaller  dimension  of  B.  ,  .  If  B,  ,  ,  is  small 
k,k-l  k+l,k  k+l,k 

for  all  k — so  that  <<  m^  or  n^  <<  n^ — then  these  bounds  are 
especially  severe.  Hence  "weakly  linked"  staircases  must  be  nearly 
balanced. 

All  of  the  above  analysis  applies  equally  well  to  any  series  of 

periods  k,  k+1 . 1:  the  rows  or  columns  of  periods  k  through  l  must 

be  independent,  and  so  various  inequalities  must  hold.  As  a  consequence, 
the  constraints  derived  above  for  one-period  balance  can  be  generalized 
to  cumulative  balances  over  periods  k  through  Jt : 


Oil* 

(vV  i 

minU^i.np  , 

i  =  i,.., 

■,  t-1 

(1) 

-min(Vnk-l)  ^  £k 

m^n *n^)  * 

k,£  =  2,. . , 

t-1 

(2) 

-min(m^,nk_1)  <  ^ 

(ni_mi)  — 

0  , 

k  =  2,... 

■.  t 

(3) 

Of  course,  ^  (n^-n^)  =  0  since  the  matrix  B  is  square.  If  constraints 
(1)  above  are  subtracted  from  this  equality,  they  yield  constraints  (3),  and 
vice-versa;  hence  (1)  and  (3)  are  equivalent. 

The  constraints  for  cumulative  balance  are  quite  strong, 
especially  when  the  number  of  periods  t  is  fairly  large.  Notably,  they 
imply  that  the  imbalance  n^  -  m^  cannot  stay  positive  or  negative  for 
too  many  consecutive  periods;  and,  the  greater  the  imbalance  is  on 
average,  the  more  often  it  must  fluctuate  between  positive  and  negative. 


Square  sub-staircases 

It  is  revealing  to  rewrite  the  inequalities  (1)  in  the  following 

way : 

- 0  "  ^ini  -  ^imi  (4) 

i-ilni  <iX  <5> 

£l(ni"mi)  <  «  Ijn1  <  ^  (6) 

In  words,  the  first  l  periods  of  the  lower  staircase  cannot  have  more 
rows  than  columns,  while  the  first  l  periods  of  either  associated 
upper  staircase  (as  defined  above)  cannot  have  more  columns  than 
rows. 

The  above  relations  are  equalities  when  l  »  t,  since  the  matrix 
is  square.  It  can  also  happen  that  equality  is  achieved  for  some  i,  <  t. 

r-» 

For  example,  if  =  t*le  must  look  something  like  this: 


- 1 

1 

1 

1 

i 

—I 

1 

1 

1 - 

1 

~l 

1 _ 

1 

• 

J 

l 


3 

1 


m 


i 


- 1 


3 

1 


n 


i 
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The  rows  and  colunns  of  periods  1  through  £  form  a  square  sub-staircase , 
as  do  the  rows  and  colurais  of  periods  £+1  through  t;  they  are  linked 

only  by  nonzero  elements  in  the  off-diagonal  block  i'  3  similar 

yl  yl  i 

way,  an  equality  /,^n^  =  Z.^m  implies  a  pair  of  square  sub-staircases 

r£  i  r£ 

within  the  row-upper  staircase  form,  and  ^n  =  £  m  implies  the  same 
for  the  column-upper  form. 

In  general,  a  single  matrix  may  exhibit  any  or  all  of  these 
three  kinds  of  equalities,  and  they  may  hold  for  several  values  of 
£  <  t.  If  p  such  equalities  hold,  then  the  matrix  decomposes  into 
p+1  disjoint  irreducible  square  sub-staircases  of  various  kinds.  (Some 
of  these  sub-staircases  may  be  null,  however,  if  there  are  no  linking 
rows  or  columns  in  certain  periods.) 

Block- triangular  reduction  of  staircase  matrices 

Square  sub-staircases  have  a  natural  relationship  to  the  block- 

triangular  reduction  of  a  staircase  matrix.  In  fact,  if  the  staircase 

is  dense — that  is,  if  there  are  no  zeroes  in  the  blocks  B,  ,  and 
-  kk 

**k+l  k — t*ien  eacb  irreducible  square  sub-staircase  coincides  with  one 
of  the  blocks  of  the  reduction.  A  weaker  corollary  of  this  result  applies 
to  the  sparser  staircases  commonly  encountered  in  applications  (especially 
linear  programming) :  each  square  sub-staircase  comprises  one  or  more 
blocks  of  the  reduction. 

The  remainder  of  this  section  outlines  proofs  of  the  above 
assertions.  The  main  proof  is  in  two  parts  that  correspond  to  the  two 
steps  of  the  standard  method  for  block-triangular  reduction:  Proposition  1 


demonstrates  a  transversal  (a  nonzero  diagonal)  within  the  dense 
staircase,  and  Proposition  2  then  shows  that  the  strong  components  of  a 
dense  staircase  are  exactly  the  square  sub-staircases.  Proposition  3 
applies  as  much  of  the  foregoing  as  possible  to  staircases  that  are 
not  dense. 

PROPOSITION  1.  A  dense  nonsingular  staircase  matrix,  in  reduced  standard 
form,  has  a  nonzero  diagonal. 

Proof .  Consider  any  column  j  of  period  l.  By  definition. 


ri-1 

Il  nt  <  j  < 


(7) 


But  nonsingularity  implies,  by  (4)  above,  that  ^  ^m^  ^n^.  Hence 


l 


l-l 


<  j 


and  consequently  row  j  must  also  be  of  period  £  or  later.  There  are 
now  two  cases  to  consider: 

(a)  Row  j  is  of  period  l .  The  jth  diagonal  element  is  at  the 
intersection  of  a  row  and  a  column  of  period  H ;  hence  it  is  an  element 
of  B  ,  and  must  be  nonzero. 

(b)  Row  j  is  from  a  period  after  l.  With  (7),  this  implies  that 

IX  <  J  i  IX 
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But  nonsingularity  implies,  by  (5)  and  (6)  above,  that  )_^n*  and 

V i  ,  Vi  i  „ 

L  _<  Z  -j^tn  .  Hence 


VU  .  ,  .  rU 
Z:n  <  J  1  Zj® 


which  means  that  j  must  be  a  linking  coluim  of  period  l  and  a  linking 
row  of  period  £+1 .  The  j  th  diagonal  element  is  thus  in  B^^  and 
must  be  nonzero.  ■ 

PROPOSITION  2.  Let  B  be  a  dense  nonsingular  staircase  matrix,  in  reduced 
standard  form.  Then  the  diagonal  blocks  of  B's  block-triangular  reduction 
are  exactly  the  irreducible  square  sub-staircases  of  B. 

Outline  of  proof:  In  light  of  Proposition  1,  it  suffices  to  show  that 
the  strong  components  of  B  correspond  to  the  irreducible  square  sub¬ 
staircases.  Since  strong  components  are  defined  by  a  symmetric  permutation, 
any  strong  component  may  be  identified  by  its  diagonal  e-lements. 

Proposition  1  implies  that  every  element  of  the  diagonal  must 
fall  into  one  of  the  following  classes: 

diagonal  elements  in  B 
E  diagonal  elements  in  B 

X*  jC  »  J.  j  X/ 

It  is  straightforward  to  verify  that,  if  class  D{  is  not  null,  all  of 
its  members  are  in  the  same  strong  component.  The  same  is  true  of  E^  . 

These  classes  have  a  natural  ordering — ,  E^,  D£,  E2 , . . .  , 

Dt_^,  such  that  the  following  is  also  straightforwardly  shown: 

if  two  classes  are  in  the  same  strong  component,  then  so  are  all  classes 
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between  them  in  the  ordering.  It  follows  that  each  strong  component 

comprises  one  chain  of  adjacent  classes:  for  example,  the  first  strong 

component  might  contain  D.  through  D  ,  the  second  E  through  D, , 

I  J  3  6 

and  so  forth.  To  specify  the  strong  components,  then,  it  suffices  to 
say  where  each  chain  ends — or  equivalently,  it  suffices  to  find  every 
place  where  two  adjacent  classes  are  not  in  the  same  strong  component. 

When  are  adjacent  classes  in  different  strong  components? 

It  is  again  straightforward  to  show  that  this  can  happen  only  in  three 
cases : 


(a) 

contains 

no  element 

on  any 

per iod-t 

linking  row 

(b) 

contains 

no  element 

on  any 

period-i 

linking  column 

(c) 

E£ 

is  null 

and  it  is  easily  seen  that  these  cases  are  equivalent,  respectively,  to 
the  following: 


(A) 

c£-l  i 
Ll  m 

p£-l 
=  "x 

(B) 

}  ,m 

L1  i 

rl  i 

=  lx" 

(C) 

pi 

)  ,  m . 

L1  l 

Xs- 

l1ni 

Thus  it  can  be  concluded  that  the  strong  components  correspond  exactly 
to  the  irreducible  square  sub-staircases.  ■ 

PROPOSITION  3.  Let  B  be  a  nonsingular  staircase  matrix  in  reduced 
standard  form.  Then  each  diagonal  block  of  B's  block-triangular  reduc¬ 
tion  lies  wholly  within  one  of  the  irreducible  square  sub-staircases  of  B. 
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Proof :  Define  a  dense  staircase  matrix  B  by  filling  in  all  zeroes 

within  the  staircase  blocks  of  B.  Clearly  B  has  all  the  same  square 
sub-staircases  as  B,  and  by  Proposition  2,  B's  diagonal  blocks  are 
exactly  its  irreducible  square  sub-staircases. 

Now  consider  deriving  B  from  B  by  substituting  zeroes 
for  certain  nonzeroes  in  B.  It  is  easily  shown  that  substituting  a 
zero  for  a  nonzero  preserves  the  structure  of  the  block-triangular 
reduction;  at  most,  a  new  zero  enables  one  diagonal  block  to  be  split 
into  several.  Thus  the  diagonal  blocks  of  B  must  lie  wholly  within 
the  diagonal  blocks  of  B. 

Putting  the  above  statements  together,  one  has  that  B's 
diagonal  blocks  lie  wholly  within  B's  diagonal  blocks,  which  are 
exactly  B's  square  sub-staircases,  which  are  the  same  as  B's  square 
sub-staircases;  hence  the  Proposition  has  been  proven.  ■ 
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ELIMINATION  OF  STAIRCASE  SYSTEMS 


Test  problems 

Staircase  systems  for  the  computational  tests  in  this  paper 
were  derived  from  a  variety  of  staircase-structured  linear  programs 
(LPs) .  Test  runs  employed  versions  of  the  MINOS  LP  code  [38,47] 
specially  modified  to  incorporate  different  sparse  elimination  techniques. 
As  explained  in  Section  1,  MINOS  merely  updated  the  LU  factorization 
at  most  iterations;  a  fresh  "refactorization"  was  performed  only  occasion¬ 
ally,  typically  10-20  times  in  a  test  run.  At  each  refactorization  one 


or  more  eliminations  were 

performed  and  pertinent 

statistics 

were  collected 

these  statistics  were  the 

raw  dat 

a  for  the  results 

reported 

here . 

The 

following  staircase 

LPs  were  used  in 

one  or  more  tests: 

NAME  OF 

NONZERO 

FIRST- 

TEST  LP 

PERIODS 

ROWS 

COLUMNS 

ELEMENTS 

ORDER? 

GR0W1 5 

15 

301 

645 

5666 

Yes 

GROW22 

22 

441 

946 

8319 

Yes 

SCSD8 

39 

398 

2750 

11349 

Yes 

SCAGR25 

25 

472 

500 

2208 

Yes 

SCRS8 

16 

491 

1169 

4106 

Yes 

SCFXM2 

8 

661 

914 

5466 

Yes 

SCTAP2 

10 

1101 

1880 

13815 

Yes 

PILOT 

9 

723 

2789 

9291 

No 

BP1 

7 

822 

1571 

11414 

No 
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All  but  two  are  first-order  staircase  LPs.  and  PILOT  is  nearly  first- 
order  (there  are  less  than  100  elements  outside  the  main  staircase). 

BP1,  by  contrast,  has  only  a  weak,  high-order  staircase  form — it  was 
not  originally  formulated  as  staircase — and  it  was  included  to  test 
the  robustness  of  some  of  the  methods.  GR0W15  and  GROW22  are  essentially 
twins  except  in  number  of  periods;  otherwise  all  of  these  LPs  differ  con¬ 
siderably  in  structure  and  proportion,  as  Appendix  A  shows  in  more  detail. 
Staircase  bases  in  the  test  runs  thus  also  differed  greatly  from  LP  to 
LP;  moreover,  for  some  LPs  the  basis  structure  changed  noticeably  from 
refactorization  to  re  factorization.  Thus  the  LP  bases  afforded  a  well- 
varied  set  of  test  matrices  for  elimination. 

Three  separate  groups  of  test  runs  are  referred  to  in  the 
following  sections.  Each  test  group  employed  a  subset  of  the  LPs 
listed  above,  and  produced  a  particular  set  of  statistics. 

The  first  group  of  tests  was  designed  to  yield  data  on 
structural  techniques  for  sparse  elimination.  At  each  refactorization 
MINOS  performed  two  eliminations,  one  by  a  standard  method  and  one  by 
the  staircase  method  developed  in  Section  5,  and  reported  extensive 
statistics  on  both  eliminations  and  on  the  basis  structure.  These 
statistics  were  averaged  to  produce  the  numbers  reported  below.  Where 
the  statistics  were  percentages — percent  fill-in,  for  example — individual 
percentage  figures  were  computed  for  each  elimination,  and  these  figures 
were  then  averaged.  (Reported  percentages  are  thus  not  based  on  the 
averaged  data,  but  are  essentially  weighted  percentages  of  the  original 
data . ) 
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The  first  test  group  used  all  of  the  above  LPs  except 
SCTAP2,  whose  bases  are  consistently  almost  triangular.  Runs  were 
started  from  an  all-slack  basis  except  for  PILOT  and  BP1;  consequently, 
some  early  bases  had  an  uncharacteristically  high  proportion  of  unit 
vectors  and  were  disregarded.  The  remaining  bases  were  classified  accord¬ 
ing  to  the  size  of  the  largest  bumps  in  their  block-triangular  reduction. 
For  SCAGR25,  SCRS8,  SCFXM2,  PILOT  and  BP1  the  bases  were  divided  into 
two  classes,  one  of  larger  and  one  of  smaller  bumps;  these  classes  are 
designated  SCAGR25+  and  SCAGR25-,  SCRS8+  and  SCRS8-,  and  so  forth,  and 
separate  averages  are  reported  for  each.  The  bases  of  SCSD8  were  simi¬ 
larly  divided  into  three  classes,  denoted  SCSD8-  (small  bumps),  SCSD8+ 
(larger  bumps),  and  SCSD&H-  (largest  bumps).  The  bases  of  GR0W15  and 
GROW22  had  bumps  of  a  fairly  uniform  size  and  so  were  not  classified. 

The  second  group  of  tests  was  similar  to  the  first,  but  yielded 
data  on  both  structural  and  local -min  techniques  for  sparse  elimination. 
Five  eliminations  were  performed  at  each  refactorization:  two  by 
standard  methods,  and  three  by  staircase  methods  of  Sections  5  and  6. 
Statistics  were  collected,  classified,  and  averaged  as  for  the  first 
test  group. 

Tests  in  the  second  group  used  GR0W15,  SCSD8,  SCAGR25,  SCRS8 
and  SCFXM2;  bases  were  divided  into  large-bump  and  small-bump  classes 
except  for  GR0W15.  The  design  of  MINOS  favors  structural  techniques, 
and  the  test  implementation  of  local-minimization  techniques  was  fairly 
inefficient;  as  a  consequence  the  second  test  group  comprised  a  smaller 
set  of  bases  than  the  first,  and  the  data  generally  have  a  greater 
variance  about  the  averages. 
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A  third  group  of  tests  measured  the  amount  of  time  spent  in 
each  LP  subroutine.  This  group  used  all  of  the  LPs  except  GR0W15  and 
GROW22;  each  LP  was  timed  twice,  once  using  the  standard  structural 
technique  for  sparse  elimination  and  once  using  the  staircase  structural 
technique  of  Section  5.  (Other  results  from  these  timing  runs  are 
reported  in  [22].) 

-4 

All  runs  employed  a  relative  pivot  tolerance  6  ■*  10  .  This 

is  a  smaller  6  than  is  usually  used,  but  it  gave  adequate  results  in 
virtually  every  case.  (Only  four  bases — one  each  from  SCAGR25,  SCRS8, 
SCFXM2  and  PILOT — yielded  unacceptable  factorizations,  both  from  struc¬ 
tural  techniques;  MINOS  automatically  refactorized  each  of  them  with 
a  higher  6.)  A  smaller  6  results  in  fewer  rejected  pivots,  so  that 
generally  less  time  is  spent  in  factorization. 

Further  details  of  all  three  groups  of  tests  are  summarized 
in  Appendix  B. 


Standard  structural  techniques 

Bump- and- spike  analysis  has  been  the  standard  structural 
technique  for  sparse  elimination  in  large-scale  linear  programming  codes. 
As  such,  it  has  been  applied  successfully  to  a  great  variety  of  structured 
linear  systems.  How,  then,  does  it  perform  on  staircase  structures? 
Experimental  results,  summarized  in  Appendix  C  and  in  Section  5  below, 
seem  to  indicate  that  bump-and-spike  analysis  may  do  well  but  sometimes 
does  quite  poorly. 
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The  critical  factor  is  clearly  the  size  of  the  bumps  in  the 
block-triangular  reduction.  When  the  reduction  yields  many  small  bumps, 
the  elimination  proceeds  quickly  and  fill-in  is  low.  When  there  are  fewer 
and  larger  bumps,  however,  performance  degrades  markedly:  fill-in  is 
greater,  running  time  of  the  P3  heuristic  increases  steeply,  and  there 
tend  to  be  more  spike  swaps  due  to  unsatisfactory  pivots.  The  situation 
is  especially  bad  when  half  or  more  of  the  basis  columns  are  in  one  or 
a  few  large  bumps. 

Such  behavior  is  predictable:  bump-and-spike  techniques  are 
predicated  upon  a  block-triangular  structure  having  many  small  bumps,  and 
they  cannot  be  expected  to  work  as  well  when  this  desired  structure  is 
absent.  The  important  thing  is  that — as  suggested  in  Section  2 — staircase 
systems  may  be  prone  to  large  bumps  when  there  are  few  square  sub-stair¬ 
cases.  Indeed,  the  data  for  SCSD8  show  a  clear  correlation  between 
bump  size  and  number  of  lower  sub-staircases.  (No  clear  relationship 
appears  for  the  other  examples,  but  it  must  be  kept  in  mind  that  a  system 
may  have  alternate  staircase  forms  that  go  unrecognized.) 

In  sum, staircase  systems  seem  especially  likely  to  have  large 
bumps.  When  they  do,  standard  bump-and-spike  techniques  lose  their 
attractiveness,  and  it  is  worth  looking  for  alternatives. 

Bump-and-spike  analysis  also  suffers  from  ignorance  of  the 
staircase  structure.  It  does  find  the  square  sub-staircases,  since  they 
each  comprise  one  or  more  bumps.  But,  within  a  square  sub-staircase,  it 
tends  to  scramble  the  rows  and  columns  without  regard  to  period.  Con¬ 
sequently  there  is  little  structure  to  L  or  U  that  might  be  exploited 
to  save  time  in  back-substitution.  Experiments  in  [22]  have  shown  that 
such  savings  can  be  significant. 
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Standard  local-minimization  techniques 

Appendix  D  and  Section  6  below  show  statistics  from  a  standard 

local-minimization  technique  applied  to  elimination  of  staircase 

systems.  Results  seem  encouraging:  fill-in  is  equal  to  or  less  than 

that  for  bump- and- spike  techniques,  and  is  much  less  sensitive  to  bump 

size.  Moreover,  there  is  no  reason  to  expect  a  gross  increase  in  running 

time  when  a  few  bumps  are  large.  Thus  local -minimization  techniques  appear 

somewhat  preferable  to  bump-and-spike  structural  techniques  in  dealing 

with  large-bump  staircase  bases. 

Local  minimization  is  at  a  disadvantage  in  handling  large 

systems,  however,  since  it  must  keep  track  of  the  uneliminated  matrix 
(k) 

£5  at  every  step.  Staircase  systems  in  particular  are  often  large 
ones,  since  they  are  generally  built  from  one-period  models  that  are  not 
small  to  begin  with.  Thus  local-minimization  techniques  may  involve  un¬ 
desirable  amounts  of  extra  storage  or  storage-management  overhead. 

Local  minimization  also  suffers,  like  bump-and-spike 
analysis,  from  ignorance  of  the  staircase  structure.  It  generally 
scrambles  all  rows  and  columns  without  regard  to  periods,  and  does  not 
even  preserve  square  sub-staircases.  Hence  none  of  the  staircase  structure 
is  passed  on  to  L  and  U. 
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4.  NATURAL  PIVOT  ORDERS  FOR  STAIRCASE  SYSTEMS 


Motivation 

Many  kinds  of  structured  systems  have  "natural"  pivot  orders 
that  can  be  proved  to  yield  no  fill-in  or  limited  fill-in.  Staircase 
systems  are  no  exception.  Described  below  are  two  natural  pivot  orders 
for  staircase  matrices:  one  involves  a  dynamic  ordering  of  rows  and 
columns,  while  the  other  preorders  the  rows  and  dynamically  orders  the 
columns  only.  These  natural  orders  motivate  the  staircase  pivot-selec¬ 
tion  methods  of  Sections  5  and  6. 

In  describing  pivot  orders  it  is  convenient  to  first  consider 
a  dense  staircase  matrix  that  is  entirely  nonzero  within  the  blocks 

and  Bk+1  ^  (or  k  reduced  form) .  Strong  limits  to 

fill-in  can  be  proved  for  dense  staircases;  these  limits  are  then  readily 
generalized  to  sparser  staircases,  which  are  the  ones  commonly  encountered 
in  applications  (especially  in  linear  programming) . 

Both  of  the  pivot  orders  described  here  are  prefigured  in 
earlier  work.  A  similar  dynamic  ordering  was  derived  by  Varah  [56] 
for  certain  kinds  of  staircase  systems.  Saigal  [44]  and  Propoi  and 
Krivonozhko  [41]  independently  developed  a  method  for  staircase  LPs 
that  preorders  the  rows  and  dynamically  orders  the  columns;  although 
they  did  not  think  in  terms  of  elimination,  their  method  must  lead  to 
a  pivot  order  like  the  second  one  below.  All  of  these  authors,  however, 
assume  that  sparsity  within  the  staircase  blocks  can  be  ignored,  and 
their  methods  do  not  extend  directly  to  sparse  staircases. 
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Natural  pivot  orders  with  dynamic  selection 

Consider  first  a  dense  staircase  matrix  in  reduced  standard  form, 
and  allow  its  rows  and  columns  to  be  rearranged  in  any  way  to  avoid  un¬ 
desirable  pivot  elements.  As  explained  in  Section  1,  a  nonzero  element 
(k) 

of  8  may  be  accepted  as  the  kth  pivot  if  it  is  at  least  6  times 

(k) 

the  largest  element  in  its  colunei  of  8  ,  or  if  it  is  at  least  6 

(k) 

times  the  largest  element  in  its  row  of  8 

In  this  case  there  need  be  no  fill-in  at  all: 

PROPOSITION  4.  If  B  is  a  nonsingular  dense  staircase  matrix  in  reduced 
standard  form,  then  there  is  an  acceptable  pivot  order  for  B  that  pro¬ 
duces  no  fill-in. 

Proof  is  by  induction  on  the  number  of  periods.  The  one-period  case  is 
trivial.  The  induction  step  considers  a  t-period  case:  the  idea  is  to 
find  k  acceptable  pivots  such  that  (i)  no  fill-in  occurs,  and  (ii)  the 
remaining  uneliminated  matrix,  g^+l),  a  (t-i)-period  staircase  matrix. 
By  the  induction  hypothesis,  pivots  may  then  be  found  within  g^k+l) 
to  complete  the  elimination  without  fill-in. 

The  desired  pivots  are  found  in  three  steps,  as  follows: 

(1)  Consider  the  m^  x  matrix — call  it  — comprising  the 

rows  and  non-linking  columns  of  period  1: 
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B  nonsingular  implies  m^  _>  n  »  and  s0  the  colmms  of  have  full 

rank.  Consequently  an  acceptable  pivot  may  be  found  in  every  colunn  of 
E^.  Carry  out  the  elimination  of  B  with  these  pivots:  clearly  there 
is  no  fill-in,  and  the  staircase  structure  is  preserved.  (If  n1  -  0 
— all  colunns  of  period  1  are  linking — or  m^  =  0,  this  step  can  be 


skipped . ) 


/  1  ,  .  V 

(2)  The  remaining  uneliminated  matrix,  6  n  ,  looks  like 


>!+!) 


r 


fj(n  is  the  same  as  B  except  for  the  x(n^-n^)  matrix — 

call  it  E2 — comprising  the  remaining  row9  and  columns  of  period  1. 
Nonsingularity  of  B  implies  2  m^»  hence  (n^-n^)  2  (m^-n^); 
since  6  must  also  be  non-singular,  it  follows  that  the  rows 

of  must  have  full  rank.  Thus  an  acceptable  pivot  may  be  found 

in  every  row  of  E^.  Again,  carry  out  the  elimination  on  B  with 
these  pivots:  there  is  no  fill-in,  and  the  staircase  structure  is 
preserved.  (If  n^  =  n^ — period  1  has  no  linking  columns — or  =  n\ 

this  step  may  be  skipped.) 

(mj+1) 

(3)  The  remaining  uneliminated  matrix, 3  ,  looks  like  this: 


! 


B<“1+1> 


l. 


(yD  1 

B  is  the  same  as  B  except  for  the  (m  -m^)  x  (n^-m^)  matrix — 

call  it  E^ — comprising  the  remaining  columns  of  period  1  and  the  linking 

rows  of  period  2.  Nonsingularity  of  B  implies  m'*'  2  ni  >  hence 
^  (ra^+1) 

(m  -m^)  2  (",-m  ) ;  since  B  must  also  be  nonsingular,  it  follows 

that  the  columns  of  E^  must  have  full  rank.  Thus  an  acceptable  pivot 
may  be  found  in  every  column  of  E^-  Eliminating  B  with  these  pivots, 
there  is  still  no  fill-in,  and  again  the  staircase  structure  is  preserved. 
(This  step  may  be  skipped  if  «  m^ — period  2  has  no  linking  rows — 
or  m^  •  n j . ) 


I* 
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AC  Chis  point  all  rows  and  columns  of  period  1,  and  all  but 
m^  +  n>2  -  n^  rows  of  period  2,  have  been  eliminated.  The  remaining 
matrix,  0  (nl+D  ^  has  this  form: 


B(nl+1) 


(n^l) 

6  is  the  same  as  B,  except  that  B_,~  has  been  reduced  to 

(Oj+1) 

(m^  +  mj  -  n^)  xn^.  Clearly,  then  0  is  itself  a  nonsingular 

dense  staircase  matrix  in  reduced  standard  form,  but  with  t-1  periods. 
Thus  the  proof  is  completed  by  induction,  as  intended.  ■ 

The  proof  above  demonstrates  a  natural  pivot  order:  essentially 
it  pivots  in  the  columns  of  each  period  before  those  of  the  next,  and 
pivots  in  the  rows  of  each  period  before  those  of  the  next.  Pivots  with 
in  each  period  are  chosen  in  an  order  that  avoids  all  fill-in  i t  the 
staircase  is  dense. 

When  the  staircase  is  not  fully  dense,  some  fill-in  is  generally 
unavoidable.  However,  an  inspection  of  the  proof  of  Proposition  4  shows 
that  the  natural  pivot  order  exists  whether  the  staircase  is  dense  or 
not;  the  only  difference  in  the  non-dense  case  is  that  some  elements 
within  the  staircase  may  fill  in.  Hence  the  following  corollary: 
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PROPOSITION  5.  If  B  is  a  nonsingular  staircase  matrix  in  reduced 


standard  form,  then  there  is  an  acceptable  pivot  order  for  B  that 
produces  no  fill-in  outside  blocks  and  B^+^ 

In  short,  there  is  a  natural  pivot  order  that  confines  all  fill- 
in  to  within  the  staircase.  It  seems  reasonable  to  expect  that  such  a 
pivot  order  will  be  a  good  one.  However,  it  is  entirely  possible  that 
there  are  other,  sparser  pivot  orders  that  do  not  confine  fill-in  to 
the  staircase.  (Indeed,  most  staircase  matrices  have  several  reduced 
standard  staircase  forms;  even  if  the  sparsest  pivot  order  confines 
fill-in  to  one  of  these  staircases,  it  may  well  not  confine  fill-in  to 
the  others.) 

The  natural  pivot  order  for  a  given  staircase  matrix  is  hardly 
unique.  In  practice  there  are  usually  many  acceptable  natural  pivot 
orders,  some  of  them  considerably  better  than  others,  and  a  further 
method  is  needed  to  choose  between  them.  Use  of  a  local-minimization 
technique  for  this  purpose  is  the  topic  of  Section  6, 

Natural  pivot  orders  with  preselection  of  rows 

Some  kind  of  bump-and-spike  structural  technique  might  also  be 
used  to  choose  a  natural  staircase  pivot  order.  But  techniques  of  this 
sort  allow  only  colmun  swaps  to  avoid  undesirable  pivots,  whereas  both 
row  and  column  rearrangements  are  generally  needed  to  find  a  pivot  order 
in  the  sense  of  Proposition  5.  Hence  bump-and-spike  techniques  require 
a  different  variety  of  natural  pivot  order — one  that  can  be  found  through 
rearrangement  of  columns  only. 


SO 


Again,  consider  first  a  dense  staircase.  Fix  the  row  order 
so  that  the  rows  are  pivoted,  as  in  Proposition  4,  by  period;  all  rows 
in  period  1  are  pivoted  first,  then  all  rows  in  period  2,  and  so  forth 
through  period  t.  Then  fill-in  is  limited  as  follows: 

PROPOSITION  6.  Let  B  be  a  nonsingular  dense  staircase  matrix,  and 
fix  in  advance  any  pivot  order  by  period  of  the  rows  of  B.  Then  there 
is  a  corresponding  acceptable  column  pivot  order  for  B  such  that 

v*  t 

(i)  Fill-in  occurs  in  as  few  as  1^2.  ~  rank  B^) 

different  columns. 

(ii)  The  number  of  elements  that  fill  in  is  at  most 

K=2  [nWi  (ni  -  V1- 

Proof  is  in  two  parts.  The  first  part  demonstrates  a  class  of  pivot 
orders  that  confine  fill-in  to  a  restricted  set  of  columns.  The  second 
part  shows  how  small  this  set  can  be  made. 

(1)  The  rows  of  period  1  intersect  only  the  columns  of  period  1. 
Consequently,  a  column  of  period  1  can  be  found  to  pivot  acceptably  in 
each  row  of  period  1.  These  pivots  produce  no  fill-in. 

(m.j+1) 

The  remaining  uneliminated  matrix,  B  ,  looks  like  this: 
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The  rows  of  period  2  now  intersect  only  the  leftover  columns  of  period  1 
and  the  columns  of  period  2.  Consequently,  a  column  of  period  1  or  2  can 
be  found  to  pivot  acceptably  in  each  row  of  period  2.  In  the  worst  case, 
each  leftover  period-1  column  fills  in  along  the  rows  of  period  3; 
there  is  no  fill-in  elsewhere. 

It  is  readily  seen  that  the  situation  of  period  2  repeats  itself 


(m.  +• 


,+D 


"1  '  "  *  +mk_i ' 

at  each  period  k.  In  the  remaining  matrix,  6  ,  the  rows  of 

period  k  intersect  only  the  leftover  columns  from  periods  1,  ...  ,  k-1, 
and  the  columns  of  period  k.  (The  columns  of  periods  1,  ...  ,  k-2  did  not 
originally  intersect  the  period-k  rows,  but  they  have  since  filled  in 
along  these  rows.)  Consequently,  a  column  from  periods  1,  ...  ,  k  can 
be  found  to  pivot  on  each  row  of  period  k.  In  the  worst  case,  each 
leftover  column  from  a  period  before  k  fills  in  along  the  rows  of 
period  k+1  (except  when  k  =  t),  but  there  is  no  fill-in  elsewhere. 

Any  pivot  sequence  chosen  in  this  way  will  maintain  the  fixed  row 
order,  and  will  confine  fill-in  to  the  "leftover"  columns.  Moreover, 
after  the  rows  of  period  k-1  have  been  pivoted,  the  number  of  leftover 
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columns  is  just  the  cumulative  excess  of  columns  over  rows,  or 
v  k—  1 

(n^  -  m^) .  Since  these  columns  fill  in  on  the  rows  of  period  k+1, 
the  worst-case  total  fill-in  is  1^=2  ^“^+1  ^i=i  ^ni  " 

(2)  Although  the  number  of  leftover  columns  after  period  k-1 
is  fixed,  the  total  of  different  leftover  columns  can  vary  widely  from 
one  pivot  choice  to  another.  It  is  desirable  to  minimize  this  total, 
so  that  fill-in  is  confined  to  as  few  columns  as  possible.  An  exact 
minimum  cannot  be  demonstrated  if  an  acceptable  pivot  order  is  required, 
but  it  is  possible  to  find  a  lower  bound  that  should  be  very  close  to 
minimum  when  B  is  well-conditioned. 

To  this  end,  leftover  columns  may  be  characterized  as  follows: 
they  are  the  columns  from  any  period  that  pivot  on  rows  of  a  later 
period.  Thus  minimizing  the  total  of  different  leftover  columns  is 
equivalent  to  maximizing,  for  each  k,  the  number  of  columns  of  period  k 
that  pivot  on  rows  of  period  k. 

(mi+*  •  •+mjc_1+l) 

Consider,  then,  the  matrix  6  that  remains 

after  rows  of  periods  1,  ...  ,  k-1  have  been  eliminated: 


g(mi+. . .+mk_i+l) 


The  rows  of  period  k  must  pivot  on  columns  of  the  submatrix  [E  B 
where  E  is  what  remains  of  leftover  columns  from  previous  periods,  and 
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is  the  (as  yet  unaltered)  diagonal  block  from  period  k  of  the 
staircase.  Exactly  m^  independent  columns  of  [E  B^]  must  be 
chosen  to  pivot  on  the  period-k  rows.  The  maximum  number  of  independent 
columns  from  B^*  however,  is  rank  B^*  Hence  at  most  rank 
period-k  columns  can  pivut  on  period-k  rows,  and  it  follows  that  the 
ninimum  number  of  leftover  columns  from  period  k  is  -  rank  B^- 
The  minimum  total  of  different  leftover  columns  is  thus 

Ek-1  (\"rank  Bkk}* 

Proposition  6  gives  a  somewhat  different  natural  pivot  order: 

most  rows  and  columns  are  pivoted  in  period  order — indeed,  most  period-k 

columns  are  pivoted  on  perlod-k  rows — but,  since  ^  n^  _>  m^ ,  some 

leftover  columns  must  be  pivoted  on  later  rows.  If  the  blocks  B.  , 

kk 

are  near  full  rank,  then  fill-in  is  confined  to  approximately 

max(0,  n^  -  m^)  columns;  the  balance  constraints  of  Section  2 
assure  that  tnis  will  not  be  too  large  a  number.  Likewise,  the  total 
fill-in  cannot  be  too  large  because  the  term  (n^  ~  m^)  is  bounded 

by  the  balance  constraints. 

The  proof  of  Proposition  6  suggests  a  procedure  for  finding  a 

natural  pivot  order  of  this  kind.  The  procedure  has  t  stages,  one 

for  each  period;  at  any  stage  the  unpivoted  leftover  columns  are  kept 

in  a  list.  At  stage  k,  find  as  many  independent  columns  of  B^  as 

possible  to  pivot  on  rows  of  period  k.  Choose  leftover  columns  from 

the  list  to  pivot  on  any  remaining  period-k  rows;  add  to  the  list  any 

leftover  unpivoted  columns  from  B, 

kk 
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The  effect  of  this  procedure  is  seen  more  clearly  from  a  permu¬ 
tation  point  of  view.  When  the  columns  are  permuted  to  place  all  pivots 
on  the  diagonal,  the  staircase  is  arranged,  for  example,  as  follows: 


l 


• .  •  «i 

■  •  •  m 
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•  original  non-zeroes 


*  fill-in 


Essentially,  just  enough  columns  are  moved  to  later  periods  so  that  the 
diagonal  blocks  B  ^  are  "squared  off"  and  made  nonsingular.  In  the 
context  of  bump-and-spike  techniques,  the  ®(cjc's  are  bumps,  and 

the  columns  moved  to  later  periods  are  "interperiod  spikes." 

As  previously,  there  is  an  obvious  corollary  for  the  case  in 
which  the  staircase  is  not  fully  dense: 
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PROPOSITION  7:  Let  B  be  a  nonsingular  staircase  matrix,  and  fix 
in  advance  any  pivot  order  by  period  of  the  rows  of  B.  Then  there 
is  a  corresponding  acceptable  column  pivot  order  for  B  such  that 


(i)  Fill-in  occurs  only  within  blocks  B^  and  B^^  ^  except 
v  t 

for  as  few  as  2.^=1  (n^  “  rank  B^)  columns  which  may  fill 
in  within  blocks  j  k+1. 

(ii)  The  fill-in  within  blocks  B  ,  ,  j  >  k+1,  is  at  most 


£2  [\+l  ^i=l  (ni  ~  mi)]  elements- 


rk-1 


Again,  there  are  many  natural  pivot  orders  that  satisfy  the  requirements, 
and  a  practical  method  is  needed  for  choosing  a  good  one.  An  adaptation 
of  the  bump-and-spike  techniques  is  especially  appropriate  for  this 
purpose,  as  explained  in  Section  5. 
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5.  A  STRUCTURAL  TECHNIQUE  FOR  STAIRCASE  ELIMINATION 

Motivation 

The  bump-and-spike  structural  technique  for  pivot  selection, 
although  it  works  well  generally,  is  not  especially  efficient  or  effec¬ 
tive  in  choosing  pivots  for  certain  staircase  matrices.  In  light  of 
the  preceding  discussion,  the  source  of  these  difficulties  is  clear. 

At  the  heart  of  the  problem  are  two  assumptions  of  bump-and-spike 
analysis:  first,  that  the  block-triangular  reduction  of  a  sparse  matrix 

yields  many  small  bumps;  and  second,  that  the  bump-and-spike  structure 
suggests  a  good  pivot  order.  Staircase  matrices  tend  to  violate  both 
of  these  assumptions.  Their  block-triangular  reductions  are  prone  to 
bumps  the  size  of  square  sub-staircases;  and  their  natural  pivot  orders 
(which  pivot  mainly  by  order  of  period)  are  different  from  the  bump-and- 
spike  order  (which  scrambles  periods) . 

These  observations  suggest  that  bump-and-spike  techniques  for 
staircase  matrices  cannot  rely  on  a  good  square-block-triangular 
structure.  An  alternative  structure  is  needed,  and  there  is  one  readily 
available:  the  staircase!  Indeed,  staircase  form  ijs  a  block-triangular 

form,  but  with  non-square  diagonal  blocks. 

Suppose,  then,  that  the  P3  heuristic  is  adapted  to  find  a  spike 
structure  in  the  non-square  diagonal  blocks,  of  the  staircase, 

rather  than  in  the  square  diagonal  blocks  of  the  block-triangular  reduc¬ 
tion.  For  a  lower  staircase,  the  new  P3  would  naturally  be  applied 
first  to  B^,  then  ®22’  an<*  s0  ^ort^  *n  Peri°d  order  through  Btt* 
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Applied  to  B^,  which  has  full  row  rank,  P3  must  find  a  pivot 
element  in  every  row.  There  may  be  some  leftover  columns,  however,  that 
cannot  be  assigned  to  pivot  on  any  period-1  row. 

Applied  to  B22’  must  find  (rank  I^)  pivots.  If  any 

period-2  rows  then  remain  unassigned,  they  may  be  pivoted  on  some  of 
the  leftover  columns  from  period  1.  There  may  also  be  some  leftover 
period-2  columns  that  cannot  be  assigned  to  pivot  on  any  period-2  row. 

The  situation  is  similar  as  P3  is  applied  to  any  block  . 

It  must  be  possible  to  assign  (rank  B  )  pivots.  If  any  period-k  rows 
then  remain  unassigned,  they  may  be  pivoted  on  some  of  the  leftover 
columns  from  previous  periods.  There  may  also  be  some  leftover  period-k 
columns  that  will  have  to  pivot  on  rows  of  later  periods. 

Provided  that  B  is  nonsingular,  this  procedure  can  always 
be  carried  out  through  Bt(_  to  yield  a  complete  assignment  of  pivots: 
nonsingularity  guarantees  that,  for  each  k,  the  rows  of  periods  1  through  k 
must  be  able  to  pivot  on  just  the  columns  of  periods  1  through  k.  The 
result  is  a  bump-and-spike  structure  having  two  kinds  of  spike  columns: 
intraperiod  spikes  of  period  k  that  are  assigned  by  P3  to  pivot  on 
period-k  rows;  and  Interperiod  spikes  of  period  k,  the  leftovers  that  must 
pivot  on  rows  in  later  periods.  Fill-in  is  confined  to  the  spikes,  as 
with  standard  bump-and-spike  techniques. 

Furthermore,  it  should  be  clear  that  the  above  procedure  produces 
a  natural  pivot  order  in  the  sense  of  Proposition  7.  Thus  it  appears 
that  both  disadvantages  of  standard  bump-and-spike  can  be  eliminated: 

P3  need  only  be  applied  to  diagonal  blocks  of  the  staircase — whose  size 
is  manageable — and  the  resulting  pivot  order  is  a  natural  one  for  stair¬ 


case  structures. 


**,  *  •  ' 


The  remainder  of  this  section  shows  how  the  above  outline  can 
be  developed  into  a  practical  technique.  Two  adaptations  of  P3  to  non¬ 
square  matrices  are  presented:  one  applies  the  heuristic  to  the  entire 
matrix,  while  the  other  decomposes  the  matrix  and  applies  P3  to  blocks 
within  it.  Other  matters  that  must  be  considered  are  choice  of  pivot 
rows  for  interperiod  spikes,  and  handling  of  pivots  that  are  zero  or 
too  small  during  elimination.  All  of  these  topics  are  considered  in 
the  context  of  first-order  staircase  systems,  but  the  same  procedures 
apply  virtually  unchanged  to  higher-order  systems. 

Computational  experiments,  reported  at  the  end  of  this  section, 
compare  a  standard  bump-and-spike  algorithm  with  one  version  of  a  stair¬ 
case  adaptation. 

Adaptation  of  P3 

Hellerman  and  Rarick's  original  algorithm  P3  [27]  is  a  heuristic 
for  rearranging  a  nonsingular  matrix  to  have  a  good  bump-and-spike  form. 
Applied  to  the  diagonal  blocks  of  a  block-triangular  reduction,  P3 
necessarily  finds  just  one  bump  per  block.  In  general,  howevei ,  P3  tries 
to  find  numerous  small  bumps,  as  many  of  size  1  as  possible.  Hence  it 
is  appropriate  to  try  to  apply  P3  to  the  diagonal  blocks  of  a  staircase 
— provided  that  the  algorithm  is  adapted  to  handle  matrices  that  are 
not  necessarily  square  or  of  full  rank. 

P3  is  a  preselection  algorithm.  Given  a  square  matrix  A,  it 
chooses  an  ordering  of  the  rows  and  columns  based  only  on  information 
at  hand  before  any  pivots  are  carried  out.  In  particular  it  needs  to 
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know  the  position  of  each  nonzero  element  in  A,  the  number  of  nonzero 
elements  in  each  column  (the  "column  count"),  and  the  number  of  nonzero 
dements  in  each  row  (the  "row  count").  Rows  and  columns  are  chosen 
for  ordering  one  at  a  time.  As  each  is  chosen,  the  relevant  counts  are 
decremented  accordingly;  thus,  at  any  point  in  the  algorithm,  the  counts 
refer  only  to  nonzeroes  in  the  remaining  unchosen  rows  and  columns. 

The  underlying  strategy  of  P3  is  to  create  triangle  columns 
whenever  possible;  spikes  are  created  only  when  there  is  no  other  choice. 
Accordingly,  the  standard  P3  has  three  main  stages: 

(1)  Choose  columns  that  have  a  count  of  1,  and  assign  them  and 
their  pivot  rows  toward  the  end  of  A. 

(2)  Choose  rows  that  have  a  count  of  1,  and  assign  them  and 
their  pivot  columns  toward  the  beginning  of  A. 

(3)  Choose  a  spike  column;  it  will  ultimately  be  the  last  column 
of  a  bump  toward  the  beginning  of  A.  Choose  other  columns 
and  rows  to  fill  out  the  bump,  trying  to  keep  it  small  and 
sparse . 

Stage  1  is  repeated  as  many  times  as  possible,  to  produce  a  set  of 
triangle  columns  at  the  end  of  A: 
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Stage  2  is  then  repeated  In  a  similar  manner  to  produce  a  set  of  triangle 
colums  at  the  beginning  of  A: 


Finally,  stage  3  adds  a  bump  after  the  initial  triangle  colunns: 


Control  now  returns  to  stage  2,  which  adds  more  triangle  columns  after 
the  bump;  then  stage  3  produces  another  bump;  and  stages  2  and  3 
continue  to  alternate  until  no  rows  and  columns  remain: 


(Stages  2  and  3  operate  in  such  a  way  that  there  is  never  occasion  to 
return  to  stage  1.) 

Adapting  P3  to  handle  staircase  blocks  requires  only  one  signifi¬ 
cant  change  to  this  outline:  stage  2  must  accommodate  rows  whose  counts 
fall  to  zero.  The  adapted  stage  2  has  two  parts: 

(2a)  Choose  rows  that  have  a  count  of  0,  and  assign  them  toward 
the  beginning  of  A.  Mark  them  as  unpivoted . 

(2b)  Choose  a  row  that  has  a  count  of  1,  and  assign  it  and  its 
pivot  column  toward  the  beginning  of  A. 

One  pass  through  stage  2  now  proceeds  as  follows:  repeat  2a  until  there 
are  no  zero-count  rows,  then  carry  out  2b  for  any  1-count  row.  As  before, 
the  entire  stage  2  is  repeated  as  many  times  as  possible,  and  alternates 
with  stage  3. 

The  adapted  P3  stops  when  all  rows  have  been  chosen.  It  puts 
A  into  a  bump-and-spike  form  like  the  following: 


sssss 


U:  unassigned  rows 
S:  unassigned  columns  (spikes) 


In  general,  most  rows  and  columns  are  paired  off  to  pivot  within  A. 
However,  some  columns  may  remain  unchosen,  and  some  rows  may  be  marked 
unpivoted. 

It  is  now  easy  to  see  how  the  adapted  P3  finds  a  natural  staircase 
pivot  order.  Applied  to  block  B^,  it  pivots  most  rows  of  the  block  on 
columns  of  the  block;  these  are  the  triangle  columns  and  intraperiod 
spikes  of  period  k.  If  any  unchosen  columns  remain,  they  become  the 
interperiod  spikes  from  period  k.  If  there  are  any  unpivoted  rows,  they 
are  assigned  pivots  on  interperiod  spikes  from  previous  periods. 

When  P3  is  applied  in  this  way  to  blocks  through  B  ,  in 

order,  the  result  is  for  all  practical  purposes  a  natural  pivot  order 
It  should  be  noted,  however,  that  the  number  of  pivots  chosen  by  P3 
within  B  k  may  be  greater  than  the  rank  of  B^*  since  P3  looks  only 
at  the  pattern  of  nonzeroes  within  B^  and  not  at  their  values.  If 
this  is  the  case,  however,  certain  intraperiod  spikes  of  period  k 
will  turn  out  to  have  pivots  of  zero  during  the  elimination;  the  situation 
will  be  remedied  by  swapping  these  spikes  with  later  interperiod  spikes. 
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as  described  later  on  in  this  section.  The  actual  pivot  order,  after 
spike  swapping,  is  indeed  a  natural  pivot  order  in  the  sense  of 
Proposition  7. 


P3  algorithm  for  staircase  blocks 

It  remains  to  state  more  fully  the  adaptation  of  P3  outlined 

above.  For  this  purpose,  R.  and  C.  will  be  the  row  and  column  counts 

i  J 

in  the  remaining  unchosen  columns  of  m  x  n  matrix  A.  The  first  and 
last  remaining  rows  will  be  y^  and  y,  and  the  first  and  last  remaining 
columns  will  be  and  v. 

The  main  routine  for  P3  is  then  as  follows: 


1:  SET  R.  =  number  of  nonzeroes  in  row  i  of  A 

l 

SET  Cj  =  number  of  nonzeroes  in  column  j  of  A 

2:  REPEAT  WHILE  C  =  1  f  r  any  remaining  column  j: 

Find  the  i  for  which  A  4  0 

ij 

Choose  j  as  the  vth  column;  set  v  =v  -  1 
Choose  i  as  the  yth  row;  set  y  =  y  -  1 

Set  C,  =  C  -  1  for  each  remaining  column  k  such  that  A  ,  4  0 
k  k  lk 

3:  REPEAT  WHILE  yQ  y: 

3.1:  REEPAT  WHILE  R^  ]  for  any  remaining  row  i: 

3.1.1:  REPEAT  WHILE  R^  =  0  for  any  remaining  row  i: 

Choose  i  as  the  y^th  row;  jet  yQ  =  yQ  +  1 
Mark  y  to  be  an  unpivoted  row 


3.1.2:  IF  =  1  for  any  remaining  row  i: 

Find  the  j  for  which  A,  .  4  0 

ij 

Choose  i  as  the  p^th  row;  set  pQ  =  pg  +  1 
Choose  i  as  the  V  th  colunn;  set  v,.  =  v  +1 

J  0  0  o 

Set  -  1  for  each  remaining  row  k 


such  that  A,  j*  0 


3.2:  CALL  SPIKE 


4:  IF  Vg  v,  mark  remaining  columns  to  be  interperiod  spikes. 

SPIKE  is  a  recursive  routine  that  is  called  whenever  all  remaining  rows 
and  columns  have  counts  of  at  least  2.  It  can  be  adapted  to  non-square 
A  with  only  minor  changes: 

SPIKE :  SI:  CALL  CHUZS  to  choose  a  remaining  column  s  to  be  a  spike 

Set  R^  =  R^  -  1  for  each  remaining  row  k  such  that  A^s  4-  0 

S2 :  REPEAT  WHILE  pQ  <.  U,  vQ  <  v  : 

S2.1:  IF  R^  >  1  for  every  remining  row  i:  CALL  SPIKE 


S2.2:  If  R^  =  1  for  some  remaining  row  i: 

CALL  CHUZJ  to  pick  a  column  j  for  which  A  j  0 
and  R^  =  1 

Choose  i  as  the  Pgth  row;  set  Pg  =  Pg  +  1 
Choose  j  as  the  v^th  column;  set  Vg  =  +  1 

Set  R^  =  R^  -  1  for  each  remaining  row  k  such 
that  A^j  ^  0 
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S2.3:  If  =  0  for  some  remaining  row  i: 

Choose  i  as  the  y^th  row;  set  Uq  =  Pq  +  1 
Make  s  the  v^th  column;  set  =  Vq  +  1 
RETURN 

For  best  results,  CHUZS  should  pick  a  column  s  that  intersects  many 
low-count  rows;  then  there  is  a  good  chance  that  one  of  these  rows 
will  soon  be  reduced  to  a  count  of  zero,  whereupon  SPIKE  will  reach 
step  2.3  and  return.  The  same  applies  to  CHUZJ ;  in  particular,  step  2.3 
will  be  reached  immediately  if  the  chosen  column  j  intersects  two  or 
more  1-count  rows.  Consequently,  both  CHUZS  and  CHUZJ  may  be  as  follows 

CHUZS,  CHUZJ: 

Cl:  FOR  each  remaining  column  k: 

FOR  l  FROM  1  TO  (v-vQ): 

SET  T^(k)  =  number  of  remaining  rows  i  such  that 

A..  ^  0  and  R,  =  l 
lk  i 

C2:  RETURN  a  column  k  for  which  the  vector 

(T  (k),  T„(k),  ...  ,  T  (k))  is  lexicographically 

0 

maximal  among  all  remaining  columns  k 

(In  their  original  proposal  [27]  Hellerman  and  Rarick  employ  a  similar, 
but  not  identical,  criterion;  they  also  indicate  how  the  recursion  of 
SPIKE  may  be  implemented  efficiently.) 
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The  output  of  P3  is  a  reordering  of  A.  The  chosen  pivots  are  the 
elements  (y,v)  or  (^q*v0)  defined  in  steps  2,  3.1.2,  S2.2  and  S2.3; 
these  are  exactly  the  elements  that  lie  on  the  diagonal  of  the  reordered 
A  if  unpivoted  rows  and  interperiod  spikes  are  disregarded. 

Canonical  reduction  of  staircase  blocks 

Provided  that  the  staircase  diagonal  blocks  are  of  moderate  size — 
say,  at  most  100  or  150  rows — applying  P3  to  each  block  should  be  adequately 
efficient.  When  the  diagonal  blocks  are  very  large,  however,  it  may  be 
worth  taking  the  following  more  elaborate  approach  that  applies  P3  only 
to  sub-blocks. 

Just  as  a  nonsingular  square  matrix  has  a  square-block-triangular 
reduction,  a  general  m  x  n  matrix  has  a  canonical  reduction  with  non-square 
blocks  along  the  diagonal.  (Dulmage  and  Mendelsohn  showed  the  essential 
uniqueness  of  this  reduction  in  their  study  of  bipartite  graphs  [18,19].) 

A  canonically  reduced  matrix  has  a  block  structure  like  this: 


Each  block  has  more  rows  than  columns,  and  possesses  a  maximum  trans¬ 
versal  (that  is,  there  is  a  oermutation  of  the  rows  of  such  that  all 

diagonal  elements  are  nonzero) .  Each  block  is  square  and  has  a  maxi¬ 

mum  transversal.  Each  E^  has  more  columns  than  rows  and  possesses  a 
maximum  transversal  (there  is  a  permutation  of  the  columns  of  E^  such 
that  all  diagonal  elements  are  nonzero). 

Since  in  practice  a  staircase  block  B^  tends  to  have  a  low 
proportion  of  nonzeroes,  it  is  reasonable  to  expect  that  the  canonical 
reduction  of  will  have  numerous  small  diagonal  blocks  ,  and 

E^ .  If  some  way  can  be  found  to  apply  P3  to  each  of  these  blocks — in¬ 
stead  of  to  B..  as  a  whole — considerable  savings  may  be  realized  when 

Bkk  ls  lar8e- 

Consider,  then,  first  applying  P3  to  one  of  the  blocks  .  Since 
has  a  maximum  transversal,  any  p  columns  of  must  intersect  at 

least  p  different  rows.  As  a  consequence,  P3  cannot  produce  any  inter¬ 
period  spikes  from  :  it  could  do  so  only  by  reducing  a  column  count 
to  zero  in  step  2  or  by  running  out  of  rows  in  SPIKE;  but  either  of  these 
cases  yields  a  set  of  p  columns  that  intersects  fewer  than  p  rows. 

Thus  P3  assigns  a  pivot  on  every  column  of  ,  and  necessarily  leaves 
some  unpivoted  rows  to  be  assigned  pivots  on  interperiod  spikes  from 
previous  periods. 

The  square  blocks  have  the  same  properties  as  the  diagonal 

blocks  of  a  block-triangular  reduction.  Hence  P3  may  be  applied  to  these 


blocks  in  the  usual  way,  and  will  find  a  pivot  on  every  row  and  column. 
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Finally,  consider  applying  P3  to  one  of  the  blocks  E  .  Because 
E^  has  a  maximum  transversal,  any  p  rows  of  must  intersect  at 

least  p  different  columns.  So,  in  this  case,  P3  cannot  leave  any  un¬ 
pivoted  rows:  it  could  do  so  only  by  reducing  a  row  count  to  zero  in 
step  3.1.1,  which  would  identify  a  set  of  p  rows  thac  intersect  fewer 
than  p  columns.  Ulus  P3  assigns  a  pivot  on  every  row  of  ,  and  leaves 
some  colunris  to  be  interperiod  spikes. 

These  observations  suggest  a  four-step  procedure  for  pivot  assign¬ 
ment  in  staircase  block  B.  ,  : 

kk 

Find  the  canonical  reduction  of  B^* 

Apply  P3  to  choose  a  pivot  in  every  column  of  each  block  C^. 
Assign  interperiod  spikes  from  previous  periods  to  pivot  on 
the  leftover  rows. 

Apply  P3  in  sequence  to  D^,  D^,  ...  to  choose  a  pivot  in 
every  row  and  column  of  these  blocks. 

Apply  P3  to  choose  a  pivot  in  every  row  of  each  block  E^. 

Save  the  leftover  columns  to  be  interperiod  spikes  from 
period  k. 

This  scheme  tends  to  pivot  old  interperiod  spikes  in  earlier  rows  of 
^kk’  ^  creates  new  interperiod  spikes  from  the  later  columns. 

Thus  it  promotes  sparsity  as  well  as  efficiency.  It  should  be  noted, 
however,  than  any  of  the  steps  2,  3,  and  A  above  may  be  empty.  In 
particular,  if  has  full  row  rank  then  the  canonical  reduction  has 

no  blocks  C„ ,  and  if  B,  ,  has  full  column  rank  then  there  are  no  E.  . 

Success  of  the  above  procedure  certainly  depends  upon  an  efficient 
algorithm  for  step  1:  the  work  in  finding  i  canonical  reduction  must  not 


(1) 

(2) 

(3) 

(4) 
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exceed  the  savings  in  applying  P3  to  only  the  small  sub-blocks.  Fortu¬ 
nately,  as  is  clear  from  [19],  finding  the  canonical  reduction  is  a 
straightforward  extension  of  finding  a  square-block-triangular  reduction. 
Indeed  the  two-part  algorithm  for  the  latter,  set  forth  in  Section  1,  is 
readily  extended  and  adapted  to  create  a  three-part  algorithm  for  canonical 
reduction  of  an  m  *  n  matrix  A: 

(1)  Find  the  longest  possible  transversal  in  A.  The  transversal 
algorithm  of  [12]  can  be  adapted  to  this  purpose  without 

dif f icul ty . 

(2)  Identify  the  blocks  and  E^.  This  is  a  simple  and  straight¬ 

forward  procedure  described  in  [19]. 

(3)  Find  the  blocks  ,  by  applying  a  strong-components  algorithm 

[16,26]  to  the  submatrix  of  rows  and  columns  that  are  not  in 
any  or  . 

If  experience  with  nonsingular  matrices  can  be  a  guide,  the  time  required 
by  these  steps  should  not  dominate  the  time  required  by  P3. 

Pivot  choice  for  interperiod  spikes 

P3  generally  leaves  (m^  -  rank  B^)  rows  of  B  ^  unpivoted,  while 
v  1 

there  are  (n^-m^)  interperiod  spikes  available  to  pivot  on  these 

rows.  Hence  often  there  is  a  choice  to  be  made:  which  interperiod  spike 
(of  periods  1,  ...  ,  k-1)  should  pivot  on  each  leftover  row  (of  period  k)? 

If  the  staircase  is  fully  dense,  the  answer  is  clear  from 
Proposition  6:  it  just  does  not  matter  which  interperiod  spike  is  assigned 
to  which  row.  The  total  fill-in  is  the  same  regardless. 
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For  the  sparse  staircases  that  arise  in  practice,  nothing  so  definite 
can  be  said.  Proposition  7  gives  an  upper  limit  on  fill-in  that  is  inde¬ 
pendent  of  where  the  interperiod  spikes  are  pivoted;  but  fill-in  could  vary 
considerably  within  the  limit.  Still,  it  is  reasonable  to  expect  that, 
over  a  variety  of  practical  problems,  any  particular  way  of  pivoting  inter¬ 
period  spikes  will  not  be  uniformly  much  better  than  any  other  way.  A 
few  informal  tests  have  tended  to  bear  this  out. 

Thus  any  simple  rule  for  pivoting  interperiod  spikes  should  be 
acceptable.  One  easily  implemented  rule,  for  example,  is  to  always  choose 
the  shortest  spike  available.  Unpivoted  interperiod  spikes  are  stored 
in  a  "stack",  in  order  of  height,  with  the  shortest  at  the  top.  If  there 
are  unpivoted  rows  in  a  spike  is  unstacked  to  pivot  on  the  first 

one,  then  another  is  unstacked  to  pivot  on  the  second,  and  so  forth. 

Then,  if  there  are  new  interperiod  spikes  from  period  k,  they  are  added 
to  the  stack,  shortest  last.  This  is  the  method  used  in  the  computational 
experiments  reported  below. 

Spike  swaps 

Once  an  initial  pivot  order  has  been  arrived  at,  it  remains  to 
actually  carry  out  the  elimination.  In  the  process  some  pivot  elements 
may  be  found  unacceptable,  and  if  so  the  order  must  be  revised.  The 
row  order,  however,  must  be  kept  to  preserve  the  fill-in  properties  of 
Proposition  7  and  the  bump-and-spike  structure.  Thus  any  adjustments 
must  be  made,  as  for  standard  bump-and-spike  methods,  by  exchanging  (or 
"swapping")  columns. 
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As  explained  in  Section  1,  there  are  two  tests  of  each  potential 

00 

pivot  element  8^  .  First  the  pivot's  magnitude  must  be  greater  than 
some  absolute  tolerance  r;  second,  the  pivot's  ratio  to  max^lB^J^I — 
the  largest  remaining  element  in  its  column — must  not  exceed  in  magnitude 
some  (usually  larger)  relative  tolerance  A. 

(k) 

If  k  is  a  triangle  column  it  cannot  fill  in,  and  so  8^  =  [B]^ 

for  all  rows  l  _>  k .  Thus  if  B  is  adequately  scaled  all  pivots  in 
triangle  columns  will  be  acceptable,  and  rejected  pivots  will  be  found 
only  in  the  spike  columns.  (This  was  in  fact  the  case  for  all  test  problems 
described  below  except  GR0W15  and  GROW22,  for  >hich  some  "triangle  swaps" 
were  necessary  to  preserve  stability.) 

A  spike  column's  pivot  can  fail  the  relative  tolerance  test  only 
because  the  pivot  row  is  poorly  chosen  from  a  numerical  standpoint. 

Failing  the  absolute  test  may  also  be  a  sign  of  numerical  problems;  how¬ 
ever,  if  B  is  at  all  well-conditioned,  a  spike  is  much  more  likely  to 
fail  the  absolute  test  because  its  pivot  is  actually  zero.  There  are 
several  ways  that  a  spike  column  k  may  have  a  zero  pivot: 


•  P3  may  assign  a  pivot  element  that  turns  out  to  be  zero: 
sometimes  [Bl^  =  0  and  fails  to  fill  in;  sometimes  [B]^  ^  0 
but  is  cancelled  to  zero. 

•  An  interperiod  spike  may  be  assigned  a  pivot  element  that  turns 
out  to  be  zero.  Indeed,  if  a  spike  from  period  l  pivots  in 
period  1+2  or  later,  necessarily  [B]^  =  0,  and  there  will  be 
a  zero  pivot  unless  it  fills  in. 

•  The  number  of  pivots  chosen  in  block  B  may  be  greater  than  the 
rank  of  B  .  Some  of  the  pivots  must  then  turn  out  to  be  zeroes. 


A  small  number  of  rejected  spikes  is  thus  usually  unavoidable,  no  matter 
how  well-behaved  B  Is  numerically. 


When  the  pivot  in  spike  k  is  rejected,  column  k  must  be  swapped 

(k) 

with  a  later,  more  acceptable  column  j;  B  then  becomes  the  pivot  element. 

Kj 

(k) 

B,  must  be  zero,  however,  if  j  is  a  triangle  column  or  a  spike  from 
k.j 

a  later  period,  and  hence  k  can  only  be  swapped  with  a  spike  in  the 
same  or  an  earlier  period.  It  follows  that,  if  k  is  an  interperiod  spike, 
it  must  be  swapped  with  another  interperiod  spike.  If  k  is  an  intra¬ 
period  spike,  however,  there  are  several  possiblities :  it  may  be  swapped 
with  another  intraperiod  spike;  it  may  be  swapped  with  an  interperiod 
spike  j  of  the  same  period,  in  which  case  k  becomes  an  interperiod  spike 
and  j  becomes  an  intraperiod  spike;  or  it  may  be  swapped  with  an  inter¬ 
period  spike  j  of  a  previous  period,  in  which  case  k  becomes  an  inter¬ 
period  spike  and  j  remains  one.  Proposition  7  suggests  that  none  of 
these  swaps  should  significantly  increase  fill-in  in  the  majority  of  cases, 
since  the  bound  on  fill-in  is  independent  of  the  number  and  position  of 
interperiod  spikes. 

Usually  there  will  be  several  spikes  eligible  for  swapping  with  k, 
and  a  good  choice  will  have  to  be  made  among  them.  To  promote  stability, 
the  new  pivot's  ratio  to  raax^lB^j^l —  the  largest  potential  pivot  in  the 
same  row — should  not  exceed  in  magnitude  the  tolerance  6.  Otherwise, 
choice  of  a  swap  spike  can  be  made  heurist ically ;  any  strategy  appropriate 


to  the  standard  bump-and-spike  method  will  also  be  appropriate  to  the 


staircase  methods  presented  here.  (Tests  described  below  adopt  the  spike¬ 
swapping  strategy  of  MINOS  [47].) 
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Test  groups  1  and  3  (as  described  in  Section  3)  provided  evidence 


upon  the  relative  sparsity,  stability  and  speed  of  the  standard  and  stair¬ 
case  bump-and-spike  elimination  methods.  Experiments  with  the  staircase 
method  employed  the  simpler  version  in  which  P3  is  applied  to  each  entire 
diagonal  block  B^I  other  details  are  given  in  Appendix  B. 

Overall  the  results  were  about  as  expected.  The  staircase  approach 
was  superior  for  large-bump  systems  having  a  good  staircase  structure. 

For  small-bump  systems  the  standard  approach  came  out  ahead. 

The  first  table  below  shows  the  numbers  of  nonzeroes  in  the  L 
and  U  factors,  and  the  fill-in  (the  percentage  increase  in  number  of  non¬ 
zeroes  in  L  and  U  over  number  of  nonzeroes  in  B) .  All  figures  are 
averages  from  test  group  1,  and  may  be  compared  with  the  statistics  in 


Appendix  C: 

L 

U 

%  FILL-IN 

STD 

STAIR 

STD 

STAIR 

STD 

STAIR 

GR0W15 

1965 

1677 

1202 

1268 

427' 

32% 

GROW22 

3742 

3189 

2152 

2452 

36% 

31% 

SCSD8- 

1149 

1126 

237 

410 

26% 

40% 

SGSD8+ 

1196 

1170 

353 

550 

36% 

51% 

SCSD8++ 

1194 

1167 

442 

688 

45% 

64% 

SCAGR25- 

140] 

1355 

54 

268 

6% 

18% 

SCAGR25+ 

1379 

1360 

121 

255 

11% 

20% 

SCRS8- 

1324 

1314 

125 

T85 

17% 

37% 

SCRS8+ 

1433 

1292 

306 

463 

42% 

44% 

SCFXM2- 

2497 

2475 

194 

551 

12% 

26% 

S"TXM2+ 

26  76 

2602 

314 

860 

19% 

38% 

PI LOT- 

5086 

3825 

2  74  7 

3011 

163% 

129% 

PI  LOT+ 

6044 

3844 

3465 

32  74 

219% 

139% 

BRI¬ 

8352 

7293 

2  360 

4576 

142% 

168% 

BED 

9494 

7562 

3037 

74 

5104 

181% 

184% 

»** 


There  is  an  evident  tradeoff  between  sparsity  in  L  and  in  U.  The 
staircase  method  tends  to  give  a  sparser  L,  consistent  with  Proposition 
7's  restrictions  on  fill-in,  but  a  denser  U,  owing  probably  to  fill-in 
within  large  interperiod  spikes.  Hence  either  method  can  be  sparser 
under  the  proper  conditions. 

The  staircase  method  seems  clearly  sparser  for  the  PILOT  systems 
— which  had  the  largest  bumps — and  for  systems  having  the  GROW15-GROW22 
structure.  For  the  large-bump  systems  of  BP1  and  SCRS8+  the  two  methods 
are  roughly  tied;  the  staircase  method  would  almost  surely  improve  on 
BP1,  however,  if  a  better  staircase  partition  were  determined  for  it. 

The  SCSD8  systems  were  the  only  fairly  large-bump  ones  for  which  the 
standard  method  is  definitely  sparser;  it  is  also  sparser  in  all  of 
the  smaller-bump  cases.  In  no  event,  however,  does  the  staircase 
method  produce  a  disastrously  large  fill-in. 

Computational  experience:  stability 

To  gauge  stability,  the  relative  tolerance  test  was  monitored 
and  the  smallest  pivot  ratio  greater  than  6  was  recorded.  In  addition, 
after  each  elimination  the  solution-check  routine  of  MINOS  was  invoked. 
This  routine  uses  the  LU  factorization  to  solve  Bx  **  b,  with  b  the 
LP’s  right-hand  side,  and  reports  the  largest  component  of  the  error 
e  ■  b  -  Bx;  it  also  performs  one  step  of  iterative  refinement,  solving 
By  ■  e  and  correcting  x  to  x  +  y,  and  from  the  relative  sizes  of 
x  and  y  it  determines  a  lower  bound  on  the  condition  number  of  B. 

In  the  test  runs  the  standard  method  has  a  generally  higher 
minimum  pivot  ratio  for  four  of  the  LPs — SCAGR25,  SCRS8,  SCSD8,  and 
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SCFXM2 — but  only  for  SCFXM2  did  It  have  a  smaller  solution-check  error 
as  well.  The  standard  method  also  gave  a  smaller  error  for  GR0W15- 
GROW22  and  PILOT;  these  LPs  had  minimum  pivot  ratios  near  6  for 
both  methods. 

On  the  whole  the  bases  of  SCAGR25,  SCSD8  and  SCRS8  were  well- 
conditioned,  SCFXM2  and  BP1  were  fairly  conditioned,  and  GROW15-GROW22 
and  PILOT  were  poorly  conditioned.  There  was  considerable  variation 
for  some  LPs,  however,  notably  SCAGR25.  Four  bases  had  to  be  refactorized 
with  higher  6  wtien  their  initial  factorizations  failed  a  solution 
check:  both  methods  failed  for  a  SCAGR25  basis, the  standard  method  failed 
on  one  SCRS8  basis  and  one  SCFXM2  basis,  and  the  staircase  method  failed 
on  a  PILOT  basis. 

A  related  statistic  is  the  number  of  spikes  rejected  due  to 
unsatisfactory  pivots: 


REJECTED 

SPIKES 

STD 

STAIR 

GR0W15 

13 

0 

GROW 2 2 

22 

1 

SCSD8- 

2 

3 

SCSD8+ 

2 

7 

SCSD8-H- 

2 

7 

SCAGR25- 

0 

1 

SCAGR25+ 

3 

6 

SCRS8- 

0 

8 

SCRS8+ 

1 

5 

SCFXM2- 

0 

4 

SCFXM2+ 

2 

5 

PI LOT- 

17 

10 

PI  LOT+ 

22 

11 

BP1- 

20 

47 

BP1+ 

18 

45 

The  large-bump,  poorly-conditioned  problems — GROW15-GROW22  and  PILOT — 
show  clearly  fewer  rejected  spikes  in  the  staircase  method;  for  BP1 
the  situation  is  reversed,  probably  because  of  its  poor  staircase 
structure.  Small-bump  problems  show  an  edge  for  the  standard  method 
(as  usual),  but  the  difference  is  rather  small. 

In  sumnary,  it  appears  that  either  method  is  acceptably  stable 
for  linear  programing,  although  the  standard  method  sometimes  looks 
better  statistically.  For  other  applications  a  higher  6  would  be 
advisable. 


Computational  experience:  speed 


Timings  of  the  two  methods  show,  as  predicted,  a  wide  gap  in 
the  efficiency  of  the  P3  heuristic: 


MEDIAN  TIME  IN  P3:  %  OF  ELIM  %  OF  TOTAL 

COLS  IN  SEC/10  ELIM  TIME  IN  P3  TIME  IN  P3 


BIG  BUMP 

STD 

SCTAP2 

1 

.0 

SCRS8 

28 

.2 

SCFXM2 

36 

.2 

SCAGR25 

45 

.4 

SCSD8 

114 

1.1 

BP1 

408 

13.1 

PILOT 

533 

20.4 

STAIR 

STD 

STAIR 

STD 

STAIR 

.7 

0% 

24% 

0% 

2% 

.2 

19% 

17% 

1% 

2% 

.8 

12% 

27% 

1% 

4% 

.2 

27% 

15% 

3% 

2% 

.2 

41% 

12% 

5% 

1% 

3.8 

47% 

10% 

10% 

3% 

2.4 

62% 

25% 

16% 

3% 
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As  the  size  of  the  largest  bump  grows,  the  standard  method’s  time  in  P3 
shoots  up  dramatically;  for  the  last  three  cases  over  40%  of  the  cost  of 
the  elimination  is  in  P3,  and  for  the  worst  case  (PILOT)  P3  was  over  15% 
of  the  total  cost  of  the  simplex  method!  By  contrast,  the  staircase 
approach  keeps  the  cost  of  P3  fairly  level,  and  its  timings  are  particularly 
low  when  there  are  many  small  periods. 


Signif leant 

times 

are  also  spent  in 

actual] y 

computing  the  L  and 

U  factors 

i.  Here  the 

cost  is  mainly  a  function  of  the 

numbers  of  spikes 

and  rejects: 

AVG  SPIKES 

SEC/10 

AVG  SPIKES 

REJECTED 

FACTORIZATIONS 

STD 

STAIR 

STD  STAIR 

STD 

STAIR 

SCTAP2 

5 

9 

0  4 

.3 

1.2 

SCRS8 

13 

16 

0  5 

.2 

.7 

SCFXM2 

28 

29 

0  3 

.5 

1.3 

SCAGR25 

9 

21 

1  4 

.3 

.8 

SCSD8 

24 

30 

1  5 

.6 

.8 

BP1 

121 

147 

20  46 

9.9 

19.4 

PILOT 

93 

97 

21  11 

9.5 

5.8 

For  small- 

-bump  problems  the 

standard  method's 

advantage  is  large  pro- 

portionately,  small  in  absolute  terms.  For  SCSD8  the  difference  in 
factorizing  times  is  swamped  by  the  difference  in  P3  times,  and  for 
PILOT  the  staircase  method  actually  factorizes  faster  owing  to 


fewer  rejects.  (BP1  again  encounters  difficulty  with  a  poor  staircase 
structure.) 

Altogether  Gaussian  elimination  involves  P3,  block-triangular 
reduction  (for  the  standard  method  only),  computation  of  L  and  U, 
and  control  and  utility  routines.  Totals  for  all  of  these  were  as 
follows : 

SEC/10  ELIMINATIONS 


STD 

STAIR 

%  STAIR/STD 

SCTAP2 

1.7 

3.0 

+  76% 

SCRS8 

1.1 

1.4 

+  27% 

SCFXM2 

1.9 

2.8 

+  47% 

SCAGR25 

1.4 

1.6 

+  14% 

SCSD8 

2.7 

1.6 

-  31% 

BP1 

27.9 

26.2 

-  6% 

PILOT 

32.8 

9.7 

-  70% 

The  advantage  switches  clearly  from  the  standard  method,  for  small- 
bump  problems,  to  the  staircase  method  for  large-bump  ones.  Even 
the  BP1  bases  are  handled  slightly  faster. 

These  results  establish  that  standard  bump-and-spike  elimination 
methods  are  unacceptably  slow  for  large-bump  staircase  systems.  For  such 
systems  a  staircase  method  should  produce  an  acceptable  elimination  in 
much  less  time. 
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6.  A  LOCAL-MINIMIZATION  TECHNIQUE  FOR  STAIRCASE  ELIMINATION 
Motivation 

Local-minimization  techniques  for  pivot  selection  are  readily 
adapted  to  find  a  natural  staircase  pivot  order  in  the  sense  of  Propositions  4 
and  5.  Moreover,  the  adapted  techniques  should  require  less  time  and  storage 
than  standard  local-minimization  techniques  applied  to  staircase  matrices. 

A  plan  of  adaptation  can  be  derived  straightforwardly  from 
Proposition  4.  The  proof  of  this  proposition  shows  that,  to  produce  a 
natural  pivot  order,  it  suffices  to  just  restrict  the  choice  of  pivot 
elements  in  certain  ways.  First,  pivot  elements  are  chosen  only  in  the 
rows  and  non-linking  columns  of  period  1;  second,  after  the  non-linking 
columns  have  been  exhausted,  pivots  are  chosen  only  in  the  rows  and 
linking  columns  of  period  1;  third,  after  the  rows  have  been  exhausted, 
pivots  are  chosen  only  in  the  linking  rows  of  period  2  and  the  linking 
columns  of  period  1.  In  this  way  all  columns  of  period  1  are  pivoted. 

The  procedure  then  continues  analogously  for  period  2:  fourth,  only  the 
rows  and  non-linking  columns  of  period  2;  fifth,  only  the  rows  and  link¬ 
ing  columns  of  period  2;  sixth,  only  the  linking  rows  of  period  3  and 
the  linking  columns  of  period  2.  Periods  3  through  t  are  treated  in 
the  same  way. 

The  only  flexibility  in  choosing  a  natural  pivot  order,  therefore, 
lies  in  selecting  pivot  elements  from  the  various  restricted  sets.  This 
selection  is  easily  made  by  any  local-minimization  technique:  at  each 
step,  choose  a  pivot  clement  that  minimizes  some  merit  function  over  all 
potential  pivots  in  the  cuiient  restricted  set.  In  other  words,  local 
minimization  is  adapted  to  find  a  natural  pivot  order  by  simply  restrict¬ 
ing  the  minimization  to  certain  subsets  of  potential  pivots. 
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It  follows  immediately  that  the 
more  efficient  than  full  local  minimizatd 
minimizes  the  merit  function  over  a  much 
to  do . 

It  is  also  easy  to  see  that  the 

have  more  modest  storage  requirements.  £ 

keep  track  of  all  nonzeroes  in  the  remair 

since  fill-in  may  occur  anywhere  within  t 

staircase  order  in  the  sense  of  Propositi 

(k) 

strictly  limited  so  that  most  of  8  d 
when  pivots  are  in  the  columns  of  period 
been  pivoted  already — there  can  be  no  fil 
after  period  2.+1.  Thus  at  most  two  perd 
a  t ime . 

Such  reduced  storage  requirement 

For  the  standard  approach  (as  explained  d 

L  and  U  must  generally  share  storage,  wit 

management  routines  and  a  fair-sized  "cue 

staircase  adaptation,  however,  could  ver} 
(k) 

part  of  8  in  a  separate  region,  savd 

and  storage-cushion  space. 

Algorithm 

It  is  not  hard  to  define  an  algc 
tions.  The  algorithm's  main  steps  are  as 
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2:  REPEAT  FOR  k  FROM  1  TO  m: 

2.1:  IF  any  non-linking  column 
2.1.1:  Minimize  MERIT(i, 
i  a  period-X.  row 
linking  column 
2.1.2:  Choose  a  pivot  (i. 
minimum 

2.2:  ELSE  IF  any  row  of  period 
2.2.1:  Minimize  MERIT(i, 
i  a  period-f.  row  ; 
2.2.2:  Choose  a  pivot  (i^ 
minimum 

2.3:  ELSE  IF  any  column  of  peric 
2.3.1:  Minimize  MERIT (i,_ 
i  a  period-(£+l)  ! 
period-f,  column 
2.3.2:  Choose  a  pivot  (i^ 
minimum 

2.4:  ELSE  SET  f.  =  P.+1 . 

Any  of  the  merit  functions  described  in  Sectic 
for  the  function  MERIT. 

If  the  storage  economies  of  this  algc 
it  is  worth  working  out  exactly  what  part  of 
at  each  step.  In  general,  it  is  only  necessai 


and  columns  of  3 


that  might  possibly  have  filled  in.  These  are 


<k) 

easily  seen  to  depend  only  on  the  current  value  of  ?,  and  on  which  step 
is  being  executed: 

2.1:  Only  period-?  rows  and  columns  can  fill  in. 

2.2:  Only  period-?  rows,  period-(?+l)  linking  rows,  and 

period-?  linking  columns  can  fill  in. 

2.3:  Only  period-(C+l)  linking  rows,  period-?,  linking 

columns  and  period-(?+l)  columns  can  fill  in. 

(k) 

Thus  the  minimum  maintained  part  of  B  is  expanded  in  three  small 

steps  as  the  columns  of  each  period  are  pivoted. 

Simplified  algorithm  fo r  full  staircase 

If  non-linking  rows  and  columns  are  not  distinguished  from  link¬ 
ing  ones,  step  2.1  of  the  preceding  algorithm  may  be  dropped  and  the 
remainder  may  be  simplified: 

FI:  SET  ?  =  1 

F2 :  REPEAT  FOR  k  FROM  1  TO  m: 

F2.1:  IF  any  row  of  period  ?  remains  unpivoted: 

(k) 

F2.1.1:  Minimize  MERIT ( i , j )  subject  to  B^j  /  0, 

i  a  period-?  row  and  j  a  period-?  column 
F2.1.2:  Choose  a  pivot  (i^,j^)  that  achieves  the 
minimum 

F2.2:  ELSE  IF  any  column  of  period  ?  remains  unpivoted: 

(k) 

F2.2.1:  Minimize  MERIT  ( i  ,  j )  subject  to  B^  /  0, 

i  a  period-(?+l)  row  and  j  a  period-? 


F2.2.2:  Choose  a  pivot  (i  ,j  )  that  achieves  the 

minimum 

F2.3:  ELSE  SET  l  =  ?  +1 

This  algorithm  just  "pivots  down  the  staircase"  one  block  at  a  time: 

pivot  elements  are  first  restricted  to  B^,  then  ^ ,  then  B^j 

B32’  B33’  '■*  and  4,0  °n  through  B^  t  t  l>  and  B^ ^ . 

(k) 

It  is  also  easy  to  describe  what  part  of  6  can  fill  in. 
If  the  algorithm  is  in  step  F2.1,  fill-in  is  confined  to  period-? 
and  period-(?+l)  rows  and  period-?  columns.  In  step  F2.2  fill-in  is 
confined  to  period-(?+l)  rows,  and  period-?  and  period-(?+l)  columns. 


Numerical  considerations 

To  make  the  above  algorithms  numerically  stable,  the  minimi¬ 
zation  must  be  confined  to  acceptable  pivots  in  an  absolute  and  relative 

(k) 

sense  as  defined  in  Section  1.  Thus  the  condition  6„  t  0  must  be 
replaced  by  | |  ^  c ;  and  it  is  necessary  to  add  either  a  row  ratio 
test,  |  0^  I  _>  iS*max{  I  8^  I ,  or  a  column  ratio  test,  |  8^  I  5*max^|8^|. 

Proposition  4  guarantees  that  an  acceptable  pivot  order  can 
always  be  found  (if  t;  is  not  set  too  large),  provided  that  the  appro¬ 
priate  ratio  test  is  conducted  at  each  step.  For  example,  in  step 

(k) 

2.1.1  the  minimization  is  over  all  nonzero  elements  of  8  in  the 


period-?  non-linking  columns,  but  over  only  some  nonzeroes  of  s'  '  in 


the  period-?  rows.  Thus  there  must  be  pivots  that  satisfy  the  column- 
ratio  test,  but  it  is  possible  (though  not  likely)  that  all  eligible 


pivots  fail  the  row-ratio  test.  It  follows  that  the  column-ratio  test 


should  be  used  in  2.1.1;  similar  reasoning  shows  that  2.3.1  should  also 
use  the  column-ratio  test,  while  2.2.1  should  use  the  row-ratio  test. 
Stable  versions  of  these  steps  are  thus  as  follows: 


A  similar  analysis  applies  to  the  simplified  full-staircase  algorithm. 

The  row-ratio  test  should  be  used  in  step  F2.1.1,  and  the  column-ratio 
test  in  F2.2.1. 

In  practice  the  arrangement  of  storage  may  make  it  inconvenient 
to  do  both  row  and  column  ratio  tests  (see,  for  example,  [17]).  Instead 
one  of  the  tests  may  be  used  exclusively,  with  provisions  for  some  sort 
of  recovery  in  the  rare  event  that  all  eligible  pivots  fail  the  test. 

The  computations  reported  below  took  this  approach — only  column-ratio 
tests  were  applied — and  encountered  no  difficulty:  there  was  not  a 
single  recorded  case  of  all  eligible  pivots  failing  the  test. 
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Higher-order  staircase  systems 


In  general  the  algorithms  of  this  section  break  down  when  there 

are  nonzero  elements  below  the  staircase:  it  can  easily  happen  that  all 

(k) 

remaining  elements  of  8  in  the  current  restricted  subset  are  zero. 

In  such  an  event  a  pivot  must  be  sought  elsewhere,  preferably  in  a  way 
that  permits  a  return  to  the  natural  pivot  order  as  soon  as  possible. 

Even  if  the  algorithms  are  modified  to  find  pivots  in  every 
case,  the  fill-in  associated  with  a  higher-order  staircase  will  be  more 
extensive.  Consequently  the  advantages  in  storage  and  speed  for  a 
fir  .t-order  system  will  tend  to  be  lost  with  higher-order  ones. 

In  light  of  these  observations  it  seems  that  the  adapted  local- 
minimization  methods  will  mainly  be  useful  for  first-order  staircase 
systems,  while  high-order  systems  are  more  amenable  to  a  bump-and-spike 
approach. 

Computational  experience 

Test  group  2  provided  some  initial  evidence  for  the  staircase 
adaptations.  Statistics  were  collected  for  both  the  reduced  and  full 
algorithms,  as  well  as  for  standard  local  minimization.  Tables  below 
may  be  compared  with  additional  data  in  Appendix  D. 

Average  fill-in  was  as  follows  (STD  indicates  standard,  F.ST. 
full  staircase,  and  R.ST.  reduced  staircase): 
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L 

U 

%  FILL-IN 

STD  F.ST.  R.ST. 

STD  F.ST.  R.ST. 

STD  F.ST.  R.ST. 

GR0W15 

1398 

1321 

1392 

1703 

1775 

1413 

39% 

39% 

26% 

SCSD8- 

762 

888 

887 

529 

612 

626 

26% 

46% 

47% 

SCSD8+ 

858 

1123 

1137 

674 

719 

731 

36% 

63% 

65% 

SCAGR25- 

916 

934 

909 

469 

443 

471 

6% 

5% 

6% 

SCAGR25+ 

776 

892 

889 

684 

616 

614 

8% 

11% 

11% 

SCRS8- 

852 

1193 

1166 

651 

619 

646 

22% 

47% 

47% 

SCRS8+ 

843 

1139 

1116 

643 

632 

673 

22% 

46% 

46% 

SCFXM2- 

1440 

1506 

1438 

952 

960 

1038 

9% 

12% 

13% 

SCFXM2+ 

1488 

1570 

1500 

1285 

1320 

1440 

12% 

17% 

19% 

The  staircase  methods  yielded  significantly  more  fill-in  for  SCSD8  and 
SCRS8,  and  only  slight  more  for  SCAGR25  and  SCFXM2;  for  GR0W15  the  fill- 
in  was  significantly  less  for  the  reduced  staircase  method.  Notably, 
GR0W15  had  the  densest  diagonal  blocks  But  few  nonzeros  in  off- 

diagonal  blocks  SCAGR25  and  SCFXM2  also  had  many  more  nonzeroes 

in  diagonal  than  off-diagonal  blocks.  SCSD8  and  SCRS8  had  the  highest 
proportions  of  off-diagonal  nonzeroes,  which  may  have  contributed  to 
propagation  of  fill-in  down  the  staircase. 

The  full  and  reduced  staircase  methods  gave  about  the  same 
results  except  for  GR0W15:  the  special  structure  of  this  LP  (Appendix  A) 
places  most  of  the  density  in  the  non-linking  columns,  so  that  the 
reduced-staircase  algorithm  yields  a  quite  different  pivot  order.  The 
rel  tive  density  of  L  and  U  fluctuates  between  methods  with  no  apparent 
pattern. 

All  eliminations  were  subjected  the  stability  tests  described 
in  Section  5.  Solution  check  errors  were  uniformly  satisfactory,  and 
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as 


no  factorizations  were  rejected;  the  errors  were  sometimes  smaller  for 
the  staircase  methods  with  SCFXM2,  and  were  definitely  smaller  with  the 
reduced  staircase  method  for  GR0W15.  Minimum  pivot  ratios  were  all 
fairly  small. 

To  make  these  runs,  local-minimization  routines  were  added  to 
the  MINOS  LP  system  in  a  crude  but  simple  way,  as  a  sophisticated  imple¬ 
mentation  would  have  required  much  more  extensive  modifications.  Con¬ 
sequently,  there  was  no  point  in  making  timings.  It  does  seem  fairly 
certain,  however,  that  the  staircase  methods  would  enjoy  some  advantage 
in  speed  and  storage;  the  size  of  this  advantage  remains  to  be  determined. 
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7.  CONCLUSIONS 


Comparison  of  elimination  techniques 

To  compare  structural  and  local-minimization  techniques,  the 
bases  of  test  group  2  were  factored  both  ways.  Resulting  fill-in  was 
as  follows: 


BUMP 

&  SPIKE 

LOCAL  MINIMIATION 

(% 

FILL-IN) 

(%  FILL-IN) 

STD 

STAIR 

STD  F. ST.  R.ST. 

GR0W15 

42% 

32% 

39% 

39% 

26% 

SCSD8- 

25% 

48% 

26% 

46% 

47% 

SCSD8+ 

39% 

53% 

36% 

63% 

65% 

SCAGR25- 

6% 

17% 

6% 

5% 

6% 

SCAGR25+ 

13% 

23% 

8% 

11% 

11% 

SCRS8- 

18% 

39% 

22% 

47% 

47% 

SCRS8+ 

44% 

22% 

46% 

46% 

SCFXM2- 

14% 

27% 

9% 

12% 

13% 

SCFXM2+ 

21% 

43% 

12% 

17% 

19% 

The  general  impression  is  that,  for  these  test  problems  neither 
technique  lias  an  overwhelming  edge;  a  decision  between  them  might  well 
be  made  on  other  grounds  such  as  speed  or  convenience. 

For  three  of  the  problems  there  is  some  indication,  however, 
that  local  minimization  has  a  greater  advantage  on  large-bump  systems 
(SCAGR25+,  SCRS8+,  SCFXM2+)  than  on  small-bump  ones  (SCAGR25-,  SCRS8-, 
SCFXM2-) .  This  stands  to  reason:  success  of  the  structural  approach 
is  predicated  upon  a  structure  of  small  bumps,  and  it  should  get  pro¬ 
gressively  worse  as  bump  size  increases.  Local  minimization  ignores 
the  bump  structure  and  is  relatively  unaffected  by  it. 
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Indications  of  numerical  stability  were  also  about  the  same 


for  the  two  techniques.  Local  minimization  tended  to  produce  somewhat 
smaller  pivot  ratios,  but  solution-check  errors  were  generally  about  the 
same.  The  one  exception  was  GR0W15,  for  which  local-minimization  gave 
much  smaller  errors. 

It  Las  been  observed  that  stability  tends  to  trade  off  against 
fill-in  according  to  the  setting  of  the  relative  tolerance  6  [17,54]: 
a  small  6  (as  in  these  experiments)  ■  tends  to  favor  low  fill-in,  while 
a  large  6  favors  stability.  Hence  the  above  comparisons  are  only 
suggestive;  results  for  a  range  of  <5  values  will  be  required  for 
more  definitive  conclusions. 

As  for  the  relative  speed  of  these  two  techniques,  virtually 
nothing  is  known:  they  are  almost  never  implemented  together,  and  no 
comparable  timings  have  been  reported.  A  careful  parallel  implementa¬ 
tion,  perhaps  within  a  linear-programming  code,  will  be  needed  to  properly 
contrast  the  efficiency  of  structural  and  local-minimization  approaches. 

Outlook  for  staircase  techniques 

The  theoretical  and  experimental  results  of  this  paper  suggest 
that  staircase  elimination  approaches  have  three  potential  advantages 
over  standard  sparse-elimination  techniques. 

First,  staircase  methods  can  produce  sparser  L  and  U  factors  for 
certain  kinds  of  staircase  systems.  When  bump-and-spike  structural 
techniques  are  employed,  the  staircase  adaptation  appears  sparser  on 
"large-bump"  systems — ones  whose  block-triangular  reductions  have  a  few 
large  diagonal  blocks.  The  staircase  variants  of  local  minimization 


90 


may  be  sparser  when  the  staircase  is  fairly  dense,  and  especially  when 
the  density  is  concentrated  is  the  diagonal  blocks  of  the  staircase. 

(In  other  situations  these  approaches  may  produce  a  somewhat  denser  L 
and  U  than  standard  methods,  but  the  differences  do  not  seem  to  be 
disastrously  large.) 

Second,  staircase  methods  can  be  more  economical.  For  large-bump 
systems,  the  standard  bump-and-spike  technique  spends  so  much  time 
analyzing  the  bump  that  it  becomes  prohibitively  expensive  compared  to 
the  staircase  version.  (For  small-bump  systems  the  standard  technique 
is  somewhat  faster,  but  both  methods  are  fast.)  If  local-minimization 
techniques  are  used,  the  staircase  variants  promise  to  require  less 
storage  and  storage-management  time  for  all  but  the  sparsest  systems. 

Third,  staircase  methods  find  natural  pivot  orders — in  which 
most  or  all  of  each  period  is  pivoted  before  the  following  ones — and 
such  orders  may  be  taken  advantage  of  to  solve  certain  linear  systems 
faster.  This  idea  has  been  applied  quite  successfully  to  linear  pro¬ 
gramming  [ 22  ] . 

In  sum,  sparse  Gaussian  elimination  of  staircase  structures  is 
worth  a  special  look.  It  appears  that  many  staircase  systems  could  be 
solved  advantageously  by  special  methods  that  take  the  staircase  structure 
into  account. 
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APPENDIX  A:  TEST  PROBLEMS 

The  linear  programs  used  for  the  computational  experiments  of 
Sections  3,  5  and  6  are  described  in  greater  detail  below.  The  tabular 
summaries  for  each  LP  are  largely  self-explanatory,  but  a  few  general 
notes  are  appropriate: 

All  statistics  except  OBJ  ELEMS  refer  only  to  the  staircase  con¬ 
straint  matrix,  excluding  the  objective  row  and  right-hand  side.  In 
each  case  the  constraint  matrix.  A,  has  been  put  in  reduced  standard  form; 
DIAGONAL  BLOCKS  refers  to  the  staircase  blocks  A^,  OFF-DIAGONAL  BLOCKS 
to  the  blocks  A^^  ^  and  SUB-STAIR  BLOCKS  (when  present)  to  the  blocks 

A  1+2,1  ’  ’  Atr 

Variables  (columns)  are  implicitly  constrained  only  to  be  non¬ 
negative,  unless  there  is  an  indication  to  the  contrary.  BOUNDED  implies 
implicit  lower  and  upper  bounds,  FIXED  implies  fixture  at  a  set  value, 
and  FREE  implies  no  implicit  constraints. 

MAX  ELEM  and  MIN  ELEM  are  the  largest  and  smallest  magnitudes  of 
elements  in  A;  LARGEST  COL  RATIO  is  the  greatest  ratio  of  magnitudes  of 
elements  in  the  same  column  of  A.  Where  values  are  given  BEFORE  SCALING 
and  AFTER  SCALING,  all  tests  were  conducted  with  A  scaled  as  described 
in  Appendix  B.  Otherwise  NO  SCALING  is  indicated. 
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GROW 15 


A  simple  dynamic  input-output  LP  constructed  for  test  purposes. 
It  is  based  on  the  following  model:  define  sets, 

IND  set  of  goods 

OBJ  set  of  export  goods:  a  subset  of  IND 
and  parameters 

T  number  of  periods 

a^  units  of  good  i  needed  to  produce  1  unit  of  good  j; 
i,  j  €  IND 


m^  maximum  production  of  good  i  in  a  period;  i  t  IND 

a  proportion  of  m^  that  may  be  stored:  om^  is  maximum 

stock  of  good  i  at  beginning  of  a  period 
Pit  expected  profit  per  unit  of  good  i  in  period  t; 
i  €  OBJ ,  t  =  1 ,  T . 

Then  the  variables  are 

x  production  of  good  i  in  period  t;  i  €  IND,  t  =  1,...,  T 
stock  of  good  i  at  the  beginning  of  period  t; 
i  €  IND,  t  =  1, . . . ,  T+l 


and  the  LP  is 

maximize 


H-ii 


it OBJ  PitXit 


subject  to 


s 

=  s  +  x  + 

)  a . . x .  ,  it 

IND, t=l , 

i ,  t+l 

it  it 

'J  j  tIND  ij  jt’ 

0  <  x, 

-  it 

<  m^. 

IND,t=l, 

0  <  Sj 
-  it 

<  om^ 

it 

IND , t=l , 
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For  GR0W15,  T  =  15.  The  values  of  a^  and  m^  were  taken  from  a 
20-sector  input-output  analysis  of  the  U.  S.  economy  in  L 2 5 ] .  A  set  OBJ 
of  size  3  was  picked  arbitrarily,  as  were  the  values  p  ;  0  was  set 
at  0.3. 


DIAGONAL  BLOCKS 


PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

1-14 

20 

43 

356 

41% 

15 

20 

43 

356 

41% 

5340 

41% 

OFF- 

-DIAGONAL  BLOCKS 

OBJ 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

20 

20 

20 

5% 

3 

3 

280 

5% 

45 

GRAND  TOTALS 
ROWS  300 

COLS  645 

ELEMS  5620 

OENS  2 . 9Z 


(ALL  EQUALITIES) 
(510  BOUNDED) 


COEFFICIENTS 
MAX  ELEM 
MIN  ELEM 

LARGEST  COL  RATIO 


NO 

SCALING 

1.0 

6.0  x  io" 
1.3  x  IQ5 
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UNCLASSIFIED 


STANFORD  UN IV  CALIF  STSTENS  OPTIMIZATION  LAS  F/»  U/l 

SPARSE  PRUSSIAN  ELIMINATION  OF  STAIRCASE  LINEAR  SYSTEMS. (U) 

NOV  T9  R  FOURER  N0001A-75-C-0EAT 

SOL-79-17  .  NL 


22 


GROW22 

Same  as  GR0W15,  but  with  the  number  of  periods  T  - 


DIAGONAL  BLOCKS  OFF-DIAGONAL  BLOCKS  OBJ 


PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

1-21 

20 

43 

356 

41Z 

20 

20 

20 

5Z 

3 

22 

20 

43 

356 

41Z 

3 

7832 

41Z 

420 

5Z 

66 

GRAND  TOTALS 


ROWS 

440 

(ALL  EQUALITIES) 

COLS 

946 

(748  BOUNDED) 

ELEMS 

8252 

DENS 

2.0Z 

COEFFICIENTS 

NO 

SCALING 

MAX  ELEM 

1.0 

MIN  ELEM 

6.0  x  10“6 

LARGEST  COL 

RATIO 

1.3  x  105 
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A  multi-stage  structural  design  problem,  documented  in  [29] 
This  is  the  only  staircase  test  problem  in  which  the  stages  do  not 
represent  periods  of  time. 


DIAGONAL  BLOCKS 


PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

1-38 

10 

70 

130 

19% 

39 

17 

90 

224 

15% 

5164  18% 


OFF-DIAGONAL  BLOCKS 
ROWS  COLS  ELEMS  DENS 

OBJ 

ELEMS 

10  50 

90 

18% 

70 

90 

3420 

18% 

2750 

GRAND  TOTALS 

ROWS 

397 

(ALL  EQUALITIES) 

COLS 

2750 

ELEMS 

8584 

DENS 

0.8% 

NO 

COEFFICIENTS 

SCALING 

MAX  EL EM 

1.0 

MIN  ELEM 

2.4  x  10_1 

LARGEST  COL 

RATIO 

4.0 

IP 


SCAGR25 

Test  problem  reclef ed  from  James  K.  Ho,  Brookhaven  National 
Laboratory,  Upton,  N.7.;  source  not  documented. 


DIAGONAL  BLOCKS 

OFF- 

-DIAGONAL  BLOCKS 

OBJ 

PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

1 

18 

20 

45 

13% 

8 

7 

17 

30% 

19 

2-24 

19 

20 

46 

12% 

8 

7 

17 

30% 

19 

25 

16 

20 

43 

13% 

19 

1146 

12% 

408 

30% 

475 

GRAND  TOTALS 

ROWS 

471  (300  EQUALITIES,  171  INEQUALITIES) 

COLS 

500 

ELEMS 

1554 

DENS 

0.7% 

NO 

COEFFICIENTS 

SCALING 

MAX  EL EM 

9.3 

-1 

MIN  EL EM 

2.0  x  10 

LARGEST  COL  RATIO 

1.9  x  101 

SCRS8 

Derived  from  a  model  of  the  United  States'  options  for  a 
transition  from  oil  and  gas  to  synthetic  fuels;  documented  in  [30, 


PERIOD 

DIAGONAL  BLOCKS 

OFF- 

-DIAGONAL  BLOCKS 

ELEMS 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

1 

28 

37 

65 

6% 

25 

22 

29 

5% 

18 

2 

28 

38 

69 

6% 

25 

22 

29 

5% 

19 

3-5 

31 

76 

181 

8% 

25 

22 

29 

5% 

55 

6-8 

32 

79 

192 

8% 

25 

22 

29 

5% 

58 

9 

31 

79 

189 

8% 

25 

22 

29 

5% 

58 

10-12 

31 

80 

190 

8% 

25 

22 

29 

5% 

59 

13-15 

30 

80 

186 

8% 

25 

22 

29 

5% 

59 

16 

31 

70 

177 

8% 

59 

2747 

8% 

435 

5% 

847 

GRAND 

TOTALS 

ROWS 

490 

(384  EQUALITIES 

,  106  INEQUALITIES) 

COLS 

1169 

ELEMS 

3182 

DENS 

0.6% 

BEFORE 

AFTER 

COEFFICIENTS 

SCALING 

SCALING 

MAX  EL EM 

3.0  x  102 

4.0 

MIN  ELEM 

1.0  x  10"3 

2.5  x  lo"1 

LARGEST  COL 

RATIO 

4.5  x  103 

1.6  x  101 

98 
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SCFXM2 


Test  problem  received  from  James  K.  Ho,  Brookhaven  National 
Laboratory,  Upton,  N.Y.;  source  not  documented. 


DIAGONAL  BLOCKS  OFF-DIAGONAL  BLOCKS  OBJ 


PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

1 

92 

114 

679 

6% 

9 

57 

61 

12% 

13 

2 

82 

99 

434 

5% 

9 

35 

35 

11% 

4 

3 

66 

126 

300 

4 % 

5 

33 

33 

20% 

1 

4 

90 

118 

1047 

10% 

5 

5 

5 

20% 

5 

5 

92 

114 

679 

6% 

9 

51 

61 

12% 

13 

6 

82 

99 

434 

5% 

9 

35 

35 

11% 

4 

7 

66 

126 

300 

4% 

5 

33 

33 

20% 

1 

8 

90 

118 

1047 

10% 

5 

4920 

7% 

263 

13% 

46 

GRAND  TOTALS 

ROWS 

660  (374  EQUALITIES,  286  INEQUALITIES) 

COLS 

914 

ELEMS 

5183 

DENS 

0.9% 

BEFORE 

AFTER 

COEFFICIENTS 

SCALING 

SCALING 

MAX  ELEM 

1.3  x  102 

1.1  x  101 

MIN  ELEM 

5.0  x  10~4 

8.7  x  10"2 

LARGEST  COL  RATIO 

1.3  x  105 

1.3  x  102 
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SCTAP2 

A  dynamic  traffic  assignment  problem,  documented  In  [31]. 
The  LP  has  11  objective  rows;  the  objective  named  OBJZZZZZ  was  used 
in  all  tests.  Statistics  below  omit  the  other  ten  objectives. 


DIAGONAL 

BLOCKS 

OFF-DIAGONAL  BLOCKS 

OBJ 

PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

1-9 

109 

188 

423 

2% 

62 

138 

276 

3% 

141 

10 

109 

188 

423 

2% 

141 

4230 

2% 

2484 

3% 

1410 

GRAND  TOTALS 


ROWS 

1090  (470  EQUALITIES,  620  INEQUALITIES) 

COLS 

1880 

ELEMS 

6714 

DENS 

0.3Z 

NO 

COEFFICIENTS 

SCALING 

MAX  ELEM 

8.0  x  I01 

MIN  ELEM 

1.0 

LARGEST  COL  RATIO 

8.0  x  IQ1 

100 

J 


PILOT 


Derived  from  a  welfare  equilibrium  model  of  the  United  States' 
energy  supply,  energy  demand,  and  economic  growth:  seeks  maximum  aggre¬ 
gate  consumer  welfare  subject  to  competitive  market  equilibrium.  The  LP 
was  supplied  by  the  PILOT  modeling  project.  Systems  Optimization  Laboratory, 


Department  of  Operations  Research,  Stanford  University;  it  is  documented 
in  [39]. 


SUB-STAIR 

DIAGONAL  BLOCKS  OFF-DIAGONAL  BLOCKS  BLOCKS  OBJ 


PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

DENS 

ELEMS 

1 

84 

343 

686 

22 

31 

74 

105 

52 

18 

02 

10 

2 

90 

345 

1079 

32 

34 

76 

111 

42 

8 

02 

10 

3 

90 

343 

1073 

32 

34 

74 

109 

4% 

5 

02 

10 

4 

90 

343 

1073 

32 

34 

74 

109 

42 

5 

02 

10 

5 

90 

343 

1073 

32 

34 

74 

109 

42 

5 

02 

10 

6 

90 

343 

1073 

32 

34 

74 

109 

42 

3 

02 

10 

7 

90 

343 

1073 

32 

32 

74 

107 

52 

1 

02 

10 

8 

87 

341 

1060 

42 

4 

19 

19 

252 

10 

9 

11 

45 

113 

232 

12 

8303 

32 

778 

42 

45 

02 

92 

GRAND  TOTALS 


ROWS 

722 

(583  EQUALITIES, 

,  139  INEQUALITIES) 

COLS 

2789 

(80  FREE,  296  BOUNDED,  79  FIXED) 

ELEMS 

9126 

DENS 

0.52 

BEFORE 

AFTER 

COEFFICIENTS 

SCALING 

SCALING 

MAX  ELEM 

4.8  *  104 

2.0  x  101 

MIN  ELEM 

1.4  x  10-4 

4.9  x  10~2 

LARGEST  COL 

RATIO 

7.0  x  106 

4.2  x  102 

Developed  by  British  Petroleum,  London;  supplied  via  the  Systems 


Optimization  Laboratory,  Dept,  of  Operations  Research,  Stanford  University. 

This  LP  is  approximately  dual-angular,  with  six  main  diagonal 
blocks  and  about  400  coupling  variables.  For  the  experiments  described 
in  this  paper  it  was  treated  as  a  6-period,  6th-order  staircase  problem. 


DIAGONAL 

BLOCKS 

OFF- 

-DIAGONAL  BLOCKS 

SUB-STAIR 

BLOCKS 

OBJ 

PERIOD 

ROWS 

COLS 

ELEMS 

DENS 

ROWS 

COLS 

ELEMS 

DENS 

ELEMS 

DENS 

ELEMS 

1 

111 

227 

1400 

6% 

3 

60 

3 

2% 

163 

0% 

138 

2 

151 

353 

2175 

4% 

62 

108 

112 

2% 

142 

0% 

149 

3 

113 

321 

964 

3% 

92 

232 

346 

2% 

494 

1% 

270 

4 

170 

295 

2178 

4% 

51 

14 

11 

2% 

4 

0% 

74 

5 

134 

198 

1315 

5% 

111 

2 

2 

1% 

40 

6 

142 

177 

1091 

4% 

56 

9123 

4% 

474 

2% 

803 

0% 

727 

GRAND  TOTALS 

ROWS 
COLS 
ELEMS 
DENS 


821  (516  EQUALITIES,  305  INEQUALITIES) 

1571 
10400 
0.8% 


COEFFICIENTS 


BEFORE 

SCALING 


MAX  ELEM 
MIN  ELEM 

LARGEST  COL  RATIO 


2.4  x  10 
2.0  x  10 
1.7  x  IQ5 


-4 


AFTER 

SCALING 


1.3  x  101 

7.6  x  10" 

1.7  x  102 


APPENDIX  B:  DETAILS  OF  COMPUTATIONAL  TESTS 


Computing  environment 

All  computational  experiments  were  performed  on  the  Triplex  system 
[57]  at  the  Stanford  Linear  Accelerator  Center,  Stanford  University.  The 
Triplex  comprises  three  computers  linked  together:  one  IBM  360/91,  and 
two  IBM  370/168s.  Runs  were  submitted  as  batch  jobs  in  a  virtual-machine 
environment,  under  the  control  of  IBM  systems  0S/VS2,  OS/MVT  and  ASP. 

Each  group  of  test  runs  employed  a  specially-modified  version  of 
the  linear-programming  routines  from  the  MINOS  system  [38,47].  MINOS  is 
written  in  standard  FORTRAN.  For  the  timed  runs  MINOS  was  compiled 
with  the  IBM  FORTRAN  IV  (H  extended,  enhanced)  compiler,  version  1.1.0, 
at  optimization  level  3  [32]. 

Timings 

All  running-time  statistics  are  based  on  "CPU  second"  totals  for 
individual  job  steps  as  reported  by  the  operating  system.  To  promote 
consistency  all  timed  jobs  were  run  or  the  Triplex  computer  designated 
"system  A,"  and  jobs  whose  timings  would  be  compared  were  run  at  about 
the  same  time.  Informal  experiments  indicated  roughly  a  IX  variation 
in  timings  due  to  varying  system  loads. 

More  detailed  timings  employed  PR0GL00K  [33],  which  takes  frequent 
samples  of  a  running  program  to  estimate  the  proportion  of  time  spent 
in  each  subroutine.  To  determine  the  actual  time  in  seconds  for  each 
subroutine,  a  Job  was  run  twice — once  without  PR0GL00K  to  measure  total 
CPU  seconds,  and  once  with  PR0GL00K  to  estimate  each  subroutine's 


proportion  of  the  total.  PROGLOOK  estimates  were  based  on  at  least  2300 
samples  per  job. 


Linear-programming  environment 

MINOS  was  set  up  for  test  runs  according  to  the  defaults  indicated 
in  [38],  with  the  exception  of  the  items  listed  below. 


Scaling.  Problems  noted  as  "scaled"  in  Appendix  A  were  subjected 
to  the  following  geometric-mean  scaling  (where  A  denotes  the  matrix  of 
constraint  coefficients,  not  including  the  objective  or  right-hand  side): 


1:  Compute  pQ  =  maxi A^  j/A.^  j|,  AJ  ^  ^  0  . 

2:  Divide  each  row  i  of  A,  and  its  corresponding  right-hand  side  value, 

1/2 

by  [(min^ I A±^ |) (max^ I ) ]  ,  taking  the  minimum  over  all 

Au  *  °- 


3:  Divide  each  column  j  of  A,  and  its  corresponding  coefficient  in  the 

1/2 

objective,  by  [ (min^ | A^ ^ | ) (max^ | A^ ^ I ) ]  ,  taking  the  minimum 

over  all  A^  j*  0. 


A:  Compute  p 


maxi  A 


iiA2j 


I.  * 


i2J 


+  0. 


The  procedure  is  repeated  as  many  times  as  possible  until,  at  step  4,  p 
is  at  least  90%  of  p^.  (In  other  words,  scaling  continues  as  long  as  it 
reduces  p,  the  greatest  ratio  of  two  elements  in  the  same  column,  by 
more  than  10%. ) 
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Starting  basis.  Runs  that  are  said  below  to  be  from  an  "all¬ 
slack  start"  employed  crash  option  0  of  MINOS:  the  initial  basis  was 
composed  entirely  of  unit  vectors,  and  all  nonbasic  variables  were  placed 
at  their  lower  (or  only)  bounds .  Runs  from  an  "advanced  start"  used  an 
initial  basis  that  had  been  reached  and  saved  in  a  previous  MINOS  run. 

Pricing.  Except  for  SCTAP2,  e  partial-pricing  scheme  of  MINOS 
was  employed — with  one  important  change:  the  arbitrary  partitioning  of 
the  columns  normally  defined  by  MINOS  for  partial  pricing  was  replaced 
by  the  natural  staircase  partition.  Thus  the  periods  of  the  staircase 
were  priced  one  at  a  time  in  a  cyclic  fashion. 

Pricing  for  SCTAP2  was  similar  except  that  the  incoming  column 
was  chosen  from  the  lastest  possible  period.  (This  choice  was  known  to 
produce  a  relatively  small  number  of  iterations  from  an  all-slack 
start. ) 


Refactorization  frequency.  MINOS  was  instructed  to  refactorize 
the  basis  (by  performing  a  fresh  Gaussian  elimination)  every  50  iterations, 
except  for  BP1  (every  75)  and  PILOT  (every  90). 

Elimination  tolerance.  The  "LU  ROW  TOL"  for  MINOS — which  is 

-4 

essentially  the  value  6  defined  in  Section  1 — was  set  to  10  .  Other 

tolerances  were  left  at  their  default  values. 
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The  linear  programs  in  test  group  1  were  used  to  investigate 
fill-in  and  stability  of  structural  elimination  techniques,  as  described 
in  Sections  3  and  5. 

The  special  version  of  MINOS  for  test  group  1  retained  MINOS' 
original  routines  for  standard  bump-and-spike  elimination.  New  routines 
were  added  to  implement  a  version  of  staircase  bump-and-spike  elimination, 
and  to  compile  statistics.  Additionally,  control  routines  were  modified 
so  that  Gaussian  elimination  was  carried  out  twice  at  each  refactorization, 
once  each  with  the  standard  and  staircase  techniques. 

Briefly,  the  new  routines  for  this  version  were  as  follows: 

SF3 — an  adaptation  of  the  P3  heuristic  to  find  spikes  non-square 
or  rank-deficient  blocks,  as  proposed  in  Section  5.  This  routine 
is  a  modification  of  the  original  MINOS  subroutine  P3. 

SP4 — main  routine  for  the  staircase  bump-and-spike  pivot-selection 
method  of  Section  5;  calls  SP3  once  for  each  diagonal  block  of  the 
staircase . 

MAP — a  basis  analyzer:  permutes  the  basis  to  reduced  standard 
form  (Section  2)  and  reports  statistics  on  the  shape  and  density 
of  staircase  blocks. 

In  addition  three  routines  were  modified  to  efficiently  handle  either  a 
standard  or  staircase  bump-and-spike  pattern:  INVERT,  the  main  routine 
for  refactorization;  FACTOR,  an  implementation  of  Gaussian  elimination 
(using  the  preselected  pivots);  and  DSPSPK,  a  utility  routine  that  prints 
a  display  of  the  spike  structure. 


For  most  runs  the  staircase  technique  was  applied  last,  and  the 


resulting  factors  were  used  in  the  following  simplex  iterations. 

However,  SCSD8  was  also  run  with  the  standard  technique  last,  as  the 
factorization  from  the  standard  technique  yielded  a  quite  different 
iteration  path. 

All  LPs  except  PILOT  and  BP1  were  run  from  an  all-slack  start  to 
optimality.  PILOT  and  BP1  were  run  1000  and  750  iterations,  respectively, 
from  advanced  starts. 


Test  group  2 

The  linear  programs  in  test  group  2  were  used  to  investigate  fill-in 
and  stability  of  elimination  techniques  that  perform  a  local  minimiza¬ 
tion,  as  described  in  Sections  3  and  6. 

The  special  version  of  MINOS  for  test  group  2  was  an  augmentation 
of  the  version  for  test  group  1.  Routines  were  added  to  implement  three 
local-minimization  strategies:  standard,  full-staircase,  and  reduced- 
staircase.  Control  routines  were  altered  so  that  Gaussian  elimination 
was  carried  out  five  times  at  each  refactorization,  once  with  each  of  the 
above  strategies  and  once  each  with  the  standard  and  staircase  bump-and- 
spike  techniques. 

The  routines  added  for  local-minimization  strategies  were  briefly 


as  follows: 


MAP — a  basis  analyzer:  permutes  the  basis  to  reduced  standard  form 
(Section  2),  reports  statistics  on  the  shape  and  density  of  staircase 
blocks,  and  constructs  a  "bit  map"  that  is  used  to  keep  track  of  non¬ 
zero  elements  during  the  elimination. 

FACMER — an  implementation  of  sparse  Gaussian  elimination  by  local- 

(k)  (k) 

minimization  techniques,  employing  (r^  -  l)(Cj  -  1)  as  the  merit 

function.  This  routine  can  be  instructed  to  use  the  standard,  full- 

staircase  or  reduced-staircase  strategy.  It  searches  rows  and  columns 

(k)  (k) 

for  the  minimizing  element  in  increasing  order  of  r^  and  c^  ; 
columns  are  searched  before  rows  of  equal  count.  Stability  of  the 
selected  pivot  is  determined  by  a  column-ratio  test  only,  as  explained 
in  Section  6. 

INVMER — main  routine  for  the  local-minimization  techniques,  based 
on  the  MINOS  subroutine  INVERT.  INVMER  calls  FACMER  three  times— 
once  for  each  strategy — and  handles  storage  allocation  and  solution 
checks. 

DRIVER,  the  main  LP  routine  of  MINOS,  was  modified  to  call  both  INVERT 
and  INVMER  and  to  collect  and  print  the  results. 

FACMER  was  intended  as  a  quick  and  simple  implementation  of  local- 
minimization  methods  using  MINOS’  existing  storate  schemes;  hence  it  is 
fairly  slow  and  does  not  make  the  most  efficient  use  of  storage.  A 
sophisticated  implementation  of  FACMER  would  require  much  more  extensive 
modifications  to  MINOS. 
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For  most  runs  the  staircase  bump-and-splke  technique  vas  applied 
last,  and  the  resulting  factors  were  used  In  the  following  simplex 
Iterations.  SCSD8  was  also  run  with  the  standard  bump-and-splke  last  as 
for  test  group  1. 

GR0W15  was  run  from  an  all-slack  start  to  optimality.  The  other 
LPs  were  run  for  240  CPU  seconds  from  an  all-slack  start,  and  also  from 
one  or  more  advanced  starts. 


Test  group  3 

The  linear  programs  of  test  group  3  were  used  to  make  timings 
of  structural  techniques  for  sparse  elimination,  as  reported  In  Section  5. 

The  special  version  of  MINOS  for  test  group  3  employed  the  same 
SP3,  SP4,  FACTOR  and  INVERT  routines  as  test  group  1,  but  only  one 
elimination — either  standard  or  staircase — was  carried  out  at  each  refactor¬ 
ization.  Two  different  runs  were  made  for  each  LP,  one  using  the  standard 
technique  only  and  one  using  the  staircase  technique  only.  Starting  bases 
and  run  lengths  were  as  for  test  group  1. 

Further  details  of  this  test  group  are  presented  in  [22],  which 
reports  much  more  extensive  results  from  the  same  timing  runs. 
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APPENDIX  C:  ADDITIONAL  STATISTICS  FROM  TEST  GROUP  1 


Sparsity 

All  figures  below  are  averages  over  sets  of  bases,  as  explained 
in  Section  3;  values  under  //  indicate  how  many  bases  are  in  each  set. 
ROWS  is  the  number  of  basis  rows  (excluding  the  objective),  and  STR  COLS 
is  the  number  of  "structural"  columns  in  the  basis — remaining  columns 
are  unit  vectors  corresponding  to  slack  variables. 


Figures  under  ELEMENTS  are  numbers  of  nonzeroes  in  the  basis 


B  in  reduced  standard  staircase  form.  Separate  totals  are  given  for  the 
diagonal  blocks  (ON),  the  off-diagonal  blocks  k  (OFF),  the 

sub-staircase  blocks  Bk+2  k  throu8^  (SUB),  and  the  entire  basis 

(TOT) .  Under  %  are  precentage  densities  for  the  diagonal  and  off- 
diagonal  blocks. 


STR  ELEMENTS 


// 

ROWS 

COLS 

ON 

% 

OFF 

% 

SUB 

TOT 

GR0W15 

3 

300 

207 

2146 

36% 

83 

12% 

— 

2229 

GROW22 

5 

440 

398 

4178 

47% 

150 

12% 

— 

4328 

SCSD8- 

21 

397 

383 

665 

16% 

436 

19% 

— 

1101 

SCSD8+ 

11 

397 

391 

684 

16% 

451 

19% 

— 

1135 

SCSD8++ 

12 

397 

393 

688 

17% 

443 

19% 

— 

1131 

SCAGR25- 

12 

471 

362 

1023 

12% 

354 

38% 

— 

1377 

SCAGR25+ 

9 

471 

338 

1005 

11% 

343 

40% 

— 

1348 

SCRS8- 

12 

490 

369 

915 

6% 

326 

8% 

— 

1241 

SCRS8+ 

4 

490 

350 

911 

6% 

309 

9% 

— 

1220 

SCFXM2- 

11 

660 

444 

2282 

4% 

126 

16% 

— 

2408 

SCFXM2+ 

5 

660 

464 

2366 

4% 

139 

15% 

— 

2505 

PILOT- 

3 

722 

690 

2661 

4% 

294 

5% 

25 

2980 

PIL0T+ 

9 

722 

688 

2660 

4% 

291 

5% 

26 

2977 

BP1- 

4 

821 

655 

3971 

4% 

185 

2% 

274 

4430 

BP1+ 

6 

821 

656 

3987 

4% 

190 

2% 

235 

4412 
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Structure 


The  first  values  below  are  the  numbers  of  periods (PER)  and  lower 
square  sub-staircases  (LOU  SSS)  In  the  staircase  bases. 

Remaining  figures  refer  to  the  block-triangular  reductions  of 
the  bases:  number  of  bumps  (BUMPS),  number  of  columns  in  the  largest  bump 
(BIG)  and  in  the  three  largest  bumps  (3  BIG) ,  and  number  of  spikes  in  all 
bumps  (SPKS) . 


LOW 

COLS 

IN  BUMPS 

PER 

SSS 

BUMPS 

BIG 

3  BIG 

SPKS 

GR0W15 

15 

5 

4 

151 

172 

92 

GROU22 

22 

2 

7 

171 

304 

183 

SCSD8- 

39 

4 

16 

83 

129 

28 

SCSD8+ 

39 

2 

15 

133 

192 

34 

SCSD8++ 

39 

0 

14 

180 

224 

36 

SCAGR25- 

25 

2 

7 

27 

45 

7 

SCAGR25+ 

25 

2 

4 

78 

106 

9 

SCRS8- 

16 

1 

13 

33 

67 

19 

SCRS8+ 

16 

1 

6 

117 

139 

16 

SCFXM2- 

8 

1 

21 

36 

82 

32 

SCFXM2+ 

8 

1 

18 

63 

119 

40 

PILOT- 

9 

_ 

9 

455 

514 

94 

PIL0T+ 

9 

5 

538 

547 

92 

BP1- 

6 

_ 

10 

316 

414 

118 

BP1+ 

6 

- 

9 

417 

429 

122 

Ill 
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APPENDIX  D:  ADDITIONAL  STATISTICS  FROM  TEST  GROUP  2 


Sparsity 


All  figures  below  are  averages  over  sets  of  bases,  as  explained 
In  Section  3;  values  under  #  indicate  how  many  bases  are  in  each  set. 
ROWS  is  the  number  of  basis  rows  (excluding  the  objective) ,  and  STR  COLS 
is  the  number  of  "structural"  columns  in  the  basis — remaining  columns 
are  unit  vectors  corresponding  to  slack  variables. 

Figures  under  ELEMENTS  are  numbers  of  nonzeroes  in  the  basis  B 
in  reduced  standard  staircase  form.  Separate  totals  are  given  for  the 
diagonal  blocks  B  (ON),  the  off-diagonal  blocks  B^+1  (OFF),  and 
the  entire  basis  (TOT) .  Under  %  are  percentage  densities  for  the 
diagonal  and  off-diagonal  blocks. 


STR  ELEMENTS 


# 

ROWS 

COLS 

ON 

_%_ 

OFF 

% 

TOT 

GR0W15 

3 

300 

207 

21A6 

36% 

83 

12% 

2229 

SCSD8- 

A 

397 

3A2 

626 

15% 

A01 

20% 

1027 

SCSD8+ 

A 

397 

396 

685 

17% 

AA1 

19% 

1126 

SCAGR25- 

3 

A71 

30A 

1013 

11% 

299 

5A% 

1312 

SCAGR25+ 

2 

A71 

32A 

1028 

12% 

320 

A8% 

13A8 

SCRS8- 

6 

A90 

36A 

917 

6% 

319 

8% 

1236 

SCRS8+ 

3 

A90 

3A7 

909 

6% 

307 

9% 

1216 

SCFXM2- 

A 

660 

395 

2086 

A% 

108 

17% 

219A 

SCFXM2+ 

2 

660 

A  58 

2331 

A% 

138 

15% 

2A69 
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Structure 


The  first  values  below  are  the  numbers  of  periods  (PER)  and 
lower  square  sub— staircases  (LOW  SSS)  in  the  staircase  bases . 

Remaining  figures  refer  to  the  block- triangular  reductions  of  the 
bases:  number  of  bumps  (BUMPS),  number  of  columns  in  the  largest  bump 
(BIG)  and  in  the  three  largest  bumps  (3  BIG) ,  and  number  of  spikes  in 
all  bumps  (SPKS) .  (Entries  marked  *  showed  too  much  variation  to  be 
meaningfully  averaged*) 

LOW  COLS  IN  BUMPS 


PER 

SSS 

BUMPS 

BIG 

3  BIG 

SPKS 

GR0W15 

15 

5 

4 

151 

172 

92 

SCSD8- 

39 

4 

15 

88 

118 

28 

SCSD8+ 

39 

3 

15 

159 

205 

35 

SCAGR25- 

25 

2 

* 

18 

29 

10 

SCAGR25+ 

25 

0 

* 

124 

155 

14 

SCRS8- 

16 

2 

12 

41 

76 

18 

SCRS8+ 

16 

0 

6 

124 

144 

15 

SCFXM2- 

8 

2 

21 

39 

88 

32 

SCFXM2+ 

8 

1 

19 

54 

123 

41 

113 
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